Two runaway barges remain stuck in the Ohio River, according to officials in Louisville, Kentucky, who reiterated Wednesday that there is still no evidence of a chemical leak, even as one of the barges is carrying the highly flammable compound methanol.

from CNN.com - RSS Channel https://ift.tt/5nVzOTy

View CNN's Fast Facts to learn about the Arab League, an organization of Middle Eastern and African countries and the Palestine Liberation Organization (PLO).

from CNN.com - RSS Channel https://ift.tt/LDrQFPg

Read Taliban Fast Facts on CNN and learn more about the Sunni Islamist organization operating primarily in Afghanistan and Pakistan.

from CNN.com - RSS Channel https://ift.tt/PR5qSy0

Hey Hacker News, we launched a few weeks ago as a GPT-powered chatbot for developer docs, and quickly realized that the value of what we're doing isn't the chatbot itself. Rather, it's the time we save developers by automating the extraction of data from their SaaS tools (GitHub, Zendesk, Salesforce, etc.) and transforming it into contextually relevant chunks that fit into GPT's context window.

A lot of companies are building prototypes with GPT right now and they’re all using some combination of Langchain/Llama Index + Weaviate/Pinecone + GPT3.5/GPT4 as their stack for retrieval augmented generation (RAG). This works great for prototypes, but what we learned was that as you scale your RAG app to more users and ingest more sources of content, it becomes a real pain to manage your data pipelines.

For example, if you want to ingest your developer docs, process them into chunks of <500 tokens, and add those chunks to a vector store, you can build a prototype with Langchain fairly quickly. However, if you want to deploy it to customers like we did for BentoML (https://www.bentoml.com/) you'll quickly realize that a naive chunking method that splits by character/token leads to poor results, and that "delete and re-vectorize everything" when the source docs change doesn't scale as a data synchronization strategy.
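To make the failure mode concrete, here is a minimal sketch of that kind of naive, size-based chunking -- the function name and the whitespace token estimate are illustrative stand-ins, not anyone's production code:

```python
# Naive size-based chunking: greedily pack paragraphs into chunks under a
# rough token budget. A whitespace count stands in for a real GPT tokenizer;
# a paragraph longer than the budget still becomes one oversized chunk,
# which is exactly the kind of edge case that hurts retrieval quality.
def naive_chunk(text: str, max_tokens: int = 500) -> list[str]:
    def count_tokens(s: str) -> int:
        return len(s.split())  # stand-in estimate, not GPT tokens

    chunks, current, current_len = [], [], 0
    for para in text.split("\n\n"):
        n = count_tokens(para)
        if current and current_len + n > max_tokens:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```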

We took the code we used to build chatbots for our early customers and turned it into an open-source framework for rapidly building new data Connectors and Chunkers. This way, developers can use community-built Connectors and Chunkers to start running vector searches on data from any source in a matter of minutes, or write their own in a matter of hours.
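As a rough illustration of how such a framework might factor the problem (these Protocol shapes are hypothetical, not the project's actual classes):

```python
# Hypothetical Connector/Chunker interfaces -- illustrative only.
from dataclasses import dataclass
from typing import Iterable, Protocol

@dataclass
class Document:
    content: str
    source_uri: str  # lets a sync strategy re-vectorize only changed docs

class Connector(Protocol):
    def load(self) -> Iterable[Document]:
        """Pull raw documents from one source (website, GitHub repo, ...)."""

class Chunker(Protocol):
    def chunk(self, doc: Document) -> Iterable[str]:
        """Split one document into retrieval-sized chunks."""
```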

Here’s a video demo: https://youtu.be/I2V3Cu8L6wk

The repo has instructions on how to get started and set up API endpoints to load, chunk, and vectorize data quickly. Right now it only works with websites and Github repos, but we’ll be adding Zendesk, Google Drive, and Confluence integrations soon too.


Comments URL: https://news.ycombinator.com/item?id=35375540

Points: 19

# Comments: 1



from Hacker News: Front Page https://ift.tt/I3hSa07

We believe that AI should be fully open source and part of the collective knowledge.

The original LLaMA code is GPL licensed, which means any project using it must also be released under GPL.

This "taints" any other code and prevents meaningful academic and commercial use.

Lit-LLaMA solves that for good.


Comments URL: https://news.ycombinator.com/item?id=35344787

Points: 27

# Comments: 9



from Hacker News: Front Page https://ift.tt/8NI5AHS

Federal regulators are weighing allowing federal candidates to boost the amount of political cash they can use to pay themselves salaries. Supporters say it would encourage more working-class Americans and caregivers to run. Critics say it could open the door to grift.

from CNN.com - RSS Channel https://ift.tt/3DL6fZX

Hey HN! We launched Zapier way back in 2012 on HN: https://news.ycombinator.com/item?id=4138415 and thought we'd return home to announce something special and hopefully exciting :) We are trying to finally live up to the "API" in our name with Zapier's first universal API:

Natural Language Actions – https://zapier.com/l/natural-language-actions

API docs – https://nla.zapier.com/api/v1/docs

(to be fair, we have published APIs before that can access Zapier data, but never before one that devs can use to directly call the 5k+ apps / 20k+ actions on our platform)

For example, you can use the API to:

  * Send messages in Slack
  * Retrieve a row in a Google Sheet
  * Draft a reply in Gmail
  * ... and thousands more actions with one universal API

We optimized NLA for use cases that receive user input in natural language (think chatbots, assistants, or any product/feature using LLMs) -- though natural language input is not strictly required!

Folks have asked for an API for 10 years and I've always been slightly embarrassed we didn't have one. We hesitated because we did not want to pass along our universe of complexity to end devs. With the help of LLMs we found some cool patterns to deliver the API we always wanted.

My co-founder/CTO Bryan did an interview with Garry on the YC blog with more details: https://www.ycombinator.com/blog/building-apis-for-ai-an-int...

We also published a LangChain integration to show off some possibilities:

  * Demo: https://www.youtube.com/watch?v=EEK_9wLYEHU
  * Jupyter notebook: https://github.com/hwchase17/langchain/blob/master/docs/modules/utils/examples/zapier.ipynb
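For a flavor of what the integration looks like, here is a sketch paraphrasing the linked notebook -- exact imports and method names vary across langchain versions, and the API keys are assumed to be set in the environment:

```python
# Sketch based on the linked notebook; assumes OPENAI_API_KEY and
# ZAPIER_NLA_API_KEY are set. Treat names as version-dependent.
from langchain.llms import OpenAI
from langchain.agents import initialize_agent
from langchain.agents.agent_toolkits import ZapierToolkit
from langchain.utilities.zapier import ZapierNLAWrapper

llm = OpenAI(temperature=0)
toolkit = ZapierToolkit.from_zapier_nla_wrapper(ZapierNLAWrapper())
agent = initialize_agent(
    toolkit.get_tools(), llm,
    agent="zero-shot-react-description", verbose=True,
)
agent.run("Send a Slack message to #general saying the deploy finished.")
```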
We know the API is not perfect but we're excited and eager for feedback to help shape it.

Comments URL: https://news.ycombinator.com/item?id=35263542

Points: 30

# Comments: 8



from Hacker News: Front Page https://ift.tt/amjg7er

Hey there HN! We're Esteban and Esteban, and we're looking for feedback on the new version of our GPT-powered, open-source code contextualizer.

We're starting with a VS Code extension that indexes information from git (GitHub, GitLab, or Bitbucket integrations available), Slack, and Jira to explain the context around a file or block of code. We then summarize that aggregated context using GPT.
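A hypothetical sketch of the core loop for the git half of that idea (the helper name, model choice, and 20-commit window are invented for illustration; the real extension also merges in Slack and Jira context):

```python
# Summarize a file's recent git history with GPT (openai 0.x SDK assumed;
# reads OPENAI_API_KEY from the environment).
import subprocess
import openai

def explain_file_history(path: str) -> str:
    log = subprocess.run(
        ["git", "log", "--follow", "-n", "20", "--format=%h %an %s", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": f"Explain the context behind these changes:\n{log}"}],
    )
    return resp["choices"][0]["message"]["content"]
```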

As devs, we know it's very annoying to land in a new codebase and have to work out all the nuances, particularly when the person who wrote the code has already left the company. With this problem in mind, we decided to build this solution: you'll be able to channel "the ghost" of the person who left the company.

Soon, we will also build a GitHub Action that does the same thing as the VS Code extension, but at the time a PR is created: it will index the most relevant information related to the new PR and add it as a comment. This way we provide context at one more moment, and the same work will make the IDE extension better.

Here's our open source repo if you also want to check it out: https://github.com/watermelontools/watermelon-extension

Please give us your feedback! Thanks.


Comments URL: https://news.ycombinator.com/item?id=35248704

Points: 9

# Comments: 1



from Hacker News: Front Page https://ift.tt/HIuxiyY

The candidate who can best convince Americans that they can handle whatever crisis comes their way and bring back stability will be the odds-on favorite to win, writes Julian Zelizer. Stability is paramount at this moment in American history, and voters will be looking for the person who can bring about better times.

from CNN.com - RSS Channel https://ift.tt/zCaDLut

Hi!

My employer recently announced a 10% pay cut across the board. Where I live (South Africa), employee consent is required.

The company sent out a document asking us to sign in agreement.

----

This is the state of things now:

* The company laid off an unknown number of employees, and didn't let the remaining employees know until a week later, in the same meeting where they announced the pay cuts

* The company is pretty much full remote and the office is a nice-to-have. It is in a very expensive part of town.

* Lunches once a week are catered at the office

* I asked what happens if one were not to sign, and the response was "Oh, we haven't thought about that. We're hoping that everyone pulls together."

* The situation will be reviewed quarterly

* The company says they don't expect it to last very long, also citing this as the reason they kept the office

----

A few things stand out to me and feel like red flags, namely:

* They chose to cut salaries rather than cut the office rent and catered lunch/snack expenses

* They have no plan should someone not sign. I would think they would have planned that out, especially since they went on about how long it took them to make this decision.

* The layoffs were hidden until the announcement, which was itself ambiguous enough that people thought more layoffs were still coming.

----

My options are to sign and take a pay cut, or refuse to sign and see what happens. The law here says I am entitled to what effectively amounts to a layoff, but I can't predict what the company will do.

The pay cut also makes my life a lot harder since we were already on a tight budget.

I would appreciate any thoughts, knowledge, or advice you might have.

I know you are not a lawyer and I am not expecting you to be, but lawyers can't speak to the real-world experience of others in the industry. I am currently looking for a lawyer to assist.


Comments URL: https://news.ycombinator.com/item?id=35235474

Points: 6

# Comments: 9



from Hacker News: Front Page https://ift.tt/0WDLIna

As the fight over abortion pills heats up nationally, Wyoming on Friday prohibited the medication in what NARAL Pro-Choice America called a "first of its kind" law, and also enacted a near-total ban on abortion.

from CNN.com - RSS Channel https://ift.tt/OBdnqUl

Hey HN! Charles here from Prequel (https://prequel.co). We just launched the ability for companies to import data from their customers' data warehouses or databases, and we wanted to share a little bit more about it with the community.

If you just want to see how it works, here’s a demo of the product that Conor recorded: https://ift.tt/rPYgwnq.

Quick background on us: we help companies integrate with their customers' data warehouses and databases. We've been busy helping companies export data to their customers – we're currently syncing over 40bn rows per month on their behalf. But folks kept asking us if we could help them import data from their customers too. They wanted to offer a 1st-party reverse ETL to their customers, similar to the 1st-party ETL capability we already helped them offer. So we built that product, and here we are.

Why would people want to import data? There are actually plenty of use cases here. Imagine a usage-based billing company that needs a daily pull from its customers of all the billing events that happened, so that it can generate the relevant invoices. Or a fraud detection company that needs the latest transaction data from its customers so it can appropriately flag fraudulent transactions.

There’s currently no great way to import customer data. People typically solve this in one of two ways. The first is importing data via CSV. This works well enough, but it requires ongoing work from the customer: they need to put a CSV together and upload it to the right place on a daily/weekly/monthly basis. This is painful and time-consuming, especially for data that needs to be continuously imported. The second is making the customer write custom code to feed data to an API. This requires the customer to do a bunch of solutions-engineering work just to get started using the product – a suboptimal onboarding experience.

So instead, we let the customer connect their database or data warehouse and we pull data directly from there, on an ongoing basis. They select which tables to import (and potentially map some columns to required fields), and that’s it. The setup only takes 5 minutes, and requires no ongoing work. We feel like that’s the kind of experience every company should provide when onboarding a new customer.

Importing all this data continuously is non-trivial, but thankfully we can reuse 95% of the infrastructure we built for data exports. It turns out our core transfer logic remains pretty much exactly the same, and all we had to do was ship new CRUD endpoints in our API layer to let users configure their source/destination. As a brief reminder about our stack, we run a Go backend and TypeScript/React frontend on k8s.

In terms of technical design, the most challenging decisions we have to make are around making databases' type systems play nicely with each other (kind of an evergreen problem, really). For imports, we let the data recipient specify whether they want to receive the data as a JSON blob or as a nicely typed table. If they choose the latter, they specify exactly which columns they're expecting, as well as what type guarantees those columns should uphold. We're also working on the ability to feed that data directly into an API endpoint, and on adding post-ingestion validation logic.
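To make the typed-table option concrete, here's the kind of import spec a data recipient might register -- the shape and field names are invented for illustration, not Prequel's actual API:

```python
# Hypothetical import spec: either take a raw JSON blob, or declare the
# exact columns and type guarantees the typed table must uphold.
import_config = {
    "source_table": "billing_events",
    "mode": "typed",  # alternative: "json" for a raw JSON blob
    "columns": [
        {"name": "event_id",    "type": "string",    "nullable": False},
        {"name": "amount",      "type": "decimal",   "nullable": False},
        {"name": "occurred_at", "type": "timestamp", "nullable": False},
    ],
}
```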

We’ve mentioned this before, but it bears repeating: we know that security and privacy are paramount here. We're SOC 2 Type II certified, and we go through annual white-box pentests to make sure that all our code is up to snuff. We never store any of the data on our servers. Finally, we offer on-prem deployments, so data never even has to touch our servers if our customers don't want it to.

We’re really stoked to be sharing this with the community. We’ll be hanging out here for most of the day, but you can also reach us at hn (at) prequel.co if you have any questions!


Comments URL: https://news.ycombinator.com/item?id=35170410

Points: 23

# Comments: 2



from Hacker News: Front Page https://ift.tt/QmMWahT

tl;dr: at Escape (YC W23), we scanned 5651+ public APIs on the internet with our in-house feedback-driven API exploration tech, and ranked them using security, performance, reliability, and design criteria. The results are public at https://apirank.dev. You can request that we add your own API to the index for free and see how it compares to others.

Why we did that?

During a YC meetup, I spoke with a fellow founder who told me how hard it is to pick the right external APIs to use in your own projects.

I realized that most of what we build relies on public APIs from external vendors, but there was no benchmark to help developers compare and evaluate public APIs before picking one.

So we decided to do it ourselves. Say hi to apirank.dev.

Why is ranking public APIs hard?

Automating public API technical assessment is a tough problem. First, we needed to find all the public APIs and their specifications - mostly OpenAPI files.

We used several strategies to find those:

- Crawl API repositories like apis.guru

- Crawl GitHub for openapi.json and openapi.yaml files

- A cool Google dork

Those strategies enabled us to gather roughly 20,000 OpenAPI specs.
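As an illustration of the first strategy, the apis.guru directory is itself served as a JSON index, so collecting spec URLs from it takes only a few lines (a sketch of the idea, not our actual crawler):

```python
# Pull the apis.guru directory and collect the URL of every OpenAPI spec.
import requests

directory = requests.get("https://api.apis.guru/v2/list.json", timeout=30).json()
spec_urls = [
    version["swaggerUrl"]                 # URL of the raw spec file
    for api in directory.values()
    for version in api["versions"].values()
]
print(f"found {len(spec_urls)} specs")
```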

Then comes the hard part of the problem:

We want to dynamically evaluate those APIs' security, performance, and reliability.

But APIs take parameters that are tightly coupled to the underlying business logic.

A naive automated approach would not work: putting random data in parameters would likely not pass the API's validation layer, giving us little insight into the API's real behavior.

Manually creating tests for each API is not sustainable either: it would take years for our 10-person team. We needed to do it in an automated way.

Fortunately, our main R&D efforts at Escape aimed to generate legitimate traffic against any API efficiently.

That's how we developed feedback-driven API exploration, a new technique that quickly assesses the underlying business logic of an API by analyzing responses and dependencies between requests (see https://escape.tech/blog/feedback-driven-api-exploration/).

We originally developed this technology for advanced API security testing. But from there, it was super easy to also test the performance and the reliability of APIs.

How did we rank APIs?

Now that we have a scalable way to gather exciting data from public APIs, we need to find a way to rank them. And this ranking should be meaningful to developers when choosing their APIs.

We decided to rank APIs using the following five criteria:

- Security

- Performance

- Reliability

- Design

- Popularity

The security score is computed as a combination of the number of OWASP Top 10 vulnerabilities and the number of sensitive information leaks detected by our scanner.

The performance score is derived from the median response time of the API, aka the P50.

The reliability score is derived from the number of inconsistent server responses, either 500 errors or responses that do not conform to the specification.

The design score reflects the quality of the OpenAPI specification file. Having comments, examples, a license, and contact information improves this score.

The popularity score is computed from the number of references to the API found online.
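Putting it together, a composite score in the spirit of these criteria might look like the sketch below -- the weights and formulas here are made up for illustration; our actual scoring is more involved:

```python
# Illustrative composite score over the five criteria (weights invented).
def score_api(vulns: int, leaks: int, p50_ms: float,
              bad_responses: int, total_responses: int,
              design_points: int, references: int) -> float:
    security = max(0.0, 100 - 15 * vulns - 10 * leaks)
    performance = max(0.0, 100 - p50_ms / 10)   # a 1000ms P50 scores 0
    reliability = 100 * (1 - bad_responses / max(total_responses, 1))
    design = min(100, 25 * design_points)       # examples, license, contact...
    popularity = min(100, 10 * references)
    return (security + performance + reliability + design + popularity) / 5
```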

If you are curious about your API's performance, you can ask us to index your own API for free at https://apirank.dev/submit


Comments URL: https://news.ycombinator.com/item?id=35084108

Points: 30

# Comments: 9



from Hacker News: Front Page https://apirank.dev/

Hi HN, we're excited to share our open source tool with the community! We previously posted here with the tagline “real-time events for Postgres” [0]. But after feedback from early users and the community, we’ve shifted our focus to working on tooling for manual database changes.

We've consistently heard teams describe challenges with the way manual data updates are handled. Seemingly every engineer we spoke with had examples of errant queries that ended up causing significant harm in production environments (data loss/service interruptions).

We’ve seen a few different approaches to how changes to production databases occur today:

Option 1: all engineers have production write access (highest speed, highest risk)

Option 2: one or a few engineers have write access (medium speed, high risk)

Option 3: engineers request temporary access to make changes (low speed, medium risk)

Option 4: all updates are checked into version control and run manually or through CI/CD (low speed, low risk)

Option 5: no manual updates are made - all changes must go through an internal endpoint (lowest speed, lowest risk)

Our goal is to enable high-speed changes with the lowest possible risk. We’re planning to do this by providing an open-source toolkit for safeguarding databases, including the following features:

- Alerts (available now): Receive notifications any time a manual change occurs

- Audit History (beta): View all historical manual changes with context

- Query Preview (coming soon): Preview affected rows and query plan prior to running changes

- Approval Flow (coming soon): Require query review before a change can be run

We’re starting with alerts. Teams can receive Slack notifications any time an INSERT, UPDATE, or DELETE is executed by a non-application database user. While this doesn’t prevent issues from occurring, it does enable an initial level of traceability: understanding who made an update, what data was changed, and when it occurred.
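For a sense of the mechanics, a minimal listener along these lines could look like the sketch below. It assumes a per-table trigger already calls pg_notify('manual_changes', ...) with a JSON payload for DML from non-application users; the channel name, payload shape, and webhook are assumptions, not our actual implementation:

```python
# Forward Postgres NOTIFY events about manual changes to a Slack webhook.
import json
import select
import psycopg2
import requests

SLACK_WEBHOOK = "https://hooks.slack.com/services/..."  # placeholder

conn = psycopg2.connect("dbname=app")
conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
cur = conn.cursor()
cur.execute("LISTEN manual_changes;")

while True:
    if select.select([conn], [], [], 5) == ([], [], []):
        continue  # timed out; keep waiting
    conn.poll()
    while conn.notifies:
        note = conn.notifies.pop(0)
        change = json.loads(note.payload)  # assumed: {"user": ..., "query": ...}
        requests.post(SLACK_WEBHOOK, json={
            "text": f"Manual change by {change['user']}: {change['query']}",
        })
```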

We’d love to hear feedback from the HN community on how you’ve seen database changes handled, pain points you’ve experienced with data change processes, or generally any feedback on our thinking and approach.

[0] https://news.ycombinator.com/item?id=34828169


Comments URL: https://news.ycombinator.com/item?id=35082508

Points: 13

# Comments: 1



from Hacker News: Front Page https://ift.tt/P8pDfRe

Hi HN! A few days ago I saw a graph [0] that showed the # of job postings on HN was declining. I started wondering what other trends I could glean from the data, so I created this!

You can filter the top-level comments by keyword; for example, filter by "remote" to see the massive spike around March 2020. Another interesting thing I found is that you can compare hiring across cities.

I hope you enjoy! The links to your searches are shareable, so if you find some interesting data you should be able to just link the page you're on!

[0] https://rinzewind.org/blog-en/2023/the-tech-downturn-seen-th...
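For anyone who wants to poke at the same data, the public HN Algolia API exposes each "Who is hiring?" thread as JSON, so a keyword share for one month fits in a few lines (a sketch of the idea, not necessarily how the site itself works):

```python
# Fraction of top-level comments in one HN thread that mention a keyword.
import requests

def keyword_share(story_id: int, keyword: str) -> float:
    thread = requests.get(
        f"https://hn.algolia.com/api/v1/items/{story_id}", timeout=30
    ).json()
    top_level = [c for c in thread["children"] if c.get("text")]
    hits = sum(keyword.lower() in c["text"].lower() for c in top_level)
    return hits / max(len(top_level), 1)

# e.g. keyword_share(<story id of a "Who is hiring?" thread>, "remote")
```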


Comments URL: https://news.ycombinator.com/item?id=35057564

Points: 26

# Comments: 1



from Hacker News: Front Page https://ift.tt/kVbq8GC

I've always wanted to just upload a whole book to ChatGPT and ask questions. Obviously, with the character limit, that's impossible... So some buddies and I built Ghost. Uploads are limited to 5 pages for now, but we plan on expanding the limit soon. Let me know what you guys think!


Comments URL: https://news.ycombinator.com/item?id=35059956

Points: 20

# Comments: 16



from Hacker News: Front Page https://ift.tt/dTzfRuX

Eight Republican senators are pressing Director of National Intelligence Avril Haines to provide them with the raw materials that informed the intelligence community's latest assessment on the origins of Covid-19, according to a letter they sent Haines on Monday.

from CNN.com - RSS Channel https://ift.tt/neJ9p8Q

Many antibody drugs are no longer effective against Covid-19 due to the rapid evolution of the virus and its many subvariants, writes Syra Madad. The US government desperately needs to continue to invest in more Covid therapeutics to keep up with the evolving nature of the virus.

from CNN.com - RSS Channel https://ift.tt/GrQhisc

The murder trial of disgraced South Carolina attorney Alex Murdaugh ended March 2 after nearly six weeks at the Colleton County Courthouse in Walterboro, a small town about 40 miles west of Charleston. Murdaugh was found guilty of killing his wife and son, who were shot to death in 2021 at the family's Islandton property, known as Moselle.

from CNN.com - RSS Channel https://ift.tt/2lz1XMt

An official with a leading human rights group says the Premier League will need to "re-examine the assurances made" about Saudi Arabian state involvement in Newcastle United after a court filing in the US named the club's chairman as "a sitting minister of the Saudi government."

from CNN.com - RSS Channel https://ift.tt/Spsaxcr

Political commentator and comedian Bill Maher sits down with CNN's Jake Tapper in an exclusive one-on-one interview to discuss topics including the debate around transgender issues, his 2024 presidential prediction, the evolving idea of "wokeness" and more.

from CNN.com - RSS Channel https://ift.tt/R5doILs