Hey HN, we’ve been experimenting a lot with MCP servers lately, and one of the most time-consuming challenges has been connecting MCP clients to remote MCP servers. To solve this, we built a library that generates installation instructions on the fly, enabling 1-click install buttons and links for most clients out there.

Feel free to try out the generator and use it to improve the README of your remote MCP server with the generated markdown. You can even configure the library to return HTML instructions if someone accesses your remote MCP server via the web.
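If it helps to picture the output, here is a hypothetical sketch (my own illustration, not the library's actual API) of what generating one such 1-click link could look like, using Cursor's deep-link scheme, which takes a server name plus a base64-encoded JSON config. Other clients each have their own URL or config format, which is exactly the tedium a generator like this abstracts away.

```python
# Hypothetical sketch, not the library's actual API: build a 1-click
# "Install in Cursor" markdown link for a remote MCP server. Cursor's
# deep link carries the server name and a base64-encoded JSON config.
import base64
import json

server_name = "my-remote-server"                        # illustrative
server_config = {"url": "https://mcp.example.com/mcp"}  # illustrative

encoded = base64.b64encode(json.dumps(server_config).encode()).decode()
deep_link = (
    "cursor://anysphere.cursor-deeplink/mcp/install"
    f"?name={server_name}&config={encoded}"
)

# Paste the result into your README:
print(f"[Install in Cursor]({deep_link})")
```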


Comments URL: https://news.ycombinator.com/item?id=45250200

Points: 7

# Comments: 0




Hey HN, we're David and Amanda from Recall.ai (https://www.recall.ai). Today we’re launching our Desktop Recording SDK, a way to get meeting data without a bot in the meeting: https://www.recall.ai/product/desktop-recording-sdk. It’s our biggest release in quite a while so we thought we’d finally do our Launch HN :)

Here’s a demo that shows it producing a transcript from a meeting, followed by examples in code: https://www.youtube.com/watch?v=4croAGGiKTA . API docs are at https://docs.recall.ai/.

Back in W20, our first product was an API that lets you send a bot participant into a meeting. This gives developers access to audio/video streams and other data in the meeting. Today, this API powers most of the meeting recording products on the market.

Recently, meeting recording through a desktop form factor instead of a bot has become popular. Many products like Notion and ChatGPT have added desktop recording functionality, and LLMs have made it easier to work with unstructured transcripts. But it’s actually hard to reliably record meetings at scale with a desktop app, and most developers who want to add recording functionality don’t want to build all this infrastructure.

Doing a basic recording with just the microphone and system audio is fairly straightforward, since you can just use the system APIs (see the sketch after this list). But it gets a lot harder when you want to capture speaker names, produce a video recording, get real-time data, or run this in production at large scale:

- Capturing speaker names involves using accessibility APIs to screen-scrape the video conference window to monitor who is speaking at what time. When video conferencing platforms change their UI, we must ship a fix immediately so this keeps working.

- Producing a clean video recording that doesn’t capture the video conferencing platform UI involves detecting the participant tiles, cropping them out, and compositing them together.

- Because the desktop recording code runs on end-user machines, we need to make it as efficient as possible. This means writing highly platform-optimized code, taking advantage of hardware encoders when available, and spending a lot of time doing profiling and performance testing.
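For a sense of how little the easy case takes, here is a minimal sketch (my own illustration, using the third-party sounddevice and soundfile packages, which the post doesn't mention) that records the default microphone. System-audio capture and everything in the list above is where the real work begins.

```python
# Minimal sketch of the "easy" case: record the default microphone and
# save a WAV file. System audio needs OS-specific loopback APIs
# (e.g. WASAPI loopback on Windows, ScreenCaptureKit on macOS).
import sounddevice as sd
import soundfile as sf

SAMPLE_RATE = 48_000   # Hz
DURATION_S = 10        # record a fixed-length clip

# Blocking capture from the default input device.
audio = sd.rec(int(DURATION_S * SAMPLE_RATE),
               samplerate=SAMPLE_RATE, channels=1)
sd.wait()  # wait for the recording to finish

sf.write("meeting_clip.wav", audio, SAMPLE_RATE)
```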

Meeting recording has zero margin for failure: if anything breaks, the data is lost forever. That reliability requirement dramatically increases the engineering effort required.

Our Desktop Recording SDK takes care of all this and lets developers build meeting recording features into their desktop apps, so they can record both video conferences and in-person meetings without a bot.

We built Recall.ai because we experienced this problem ourselves. At our first startup, we built a tool for product managers that included a meeting recording feature. 70% of our engineering time was taken up by just this feature! We ended up starting Recall.ai to solve this instead. Since then, over 2,000 companies have used us to power their recording features, e.g. HubSpot for sales call recording and ClickUp for their AI note taker. Our users are engineering teams building commercial products for financial services, telehealth, incident management, sales, interviewing, and more. We also power internal tooling for large enterprises.

Running this sort of infrastructure has led to unexpected technical challenges! For example, we had to debug a 1 in 36 million segfault in our audio encoder (https://www.recall.ai/blog/debugging-a-1-in-36-000-000-segfa...), we encountered a Postgres lock-up that only occurs when you have tens of thousands of concurrent writers (https://news.ycombinator.com/item?id=44490510), and we saved over $1M a year on AWS by optimizing the way we shuffle data around between our processes (https://news.ycombinator.com/item?id=42067275).

You can try it here: https://www.recall.ai. It's self-serve with $5 of free credits. Pricing starts at $0.70 per hour of recording, prorated to the second, with volume discounts at scale.

All data recorded through Recall.ai is the property of our customers; we support 0-day retention, and we don’t train models on customer data.

We would love your feedback!


Comments URL: https://news.ycombinator.com/item?id=45199648

Points: 13

# Comments: 3




TLDR: A small, vendor-agnostic inference loop that turns token logprobs, perplexity, and entropy into at most one extra refinement pass, recovering some "reasoning" quality for LLMs.

- Captures logprobs/top-k during generation, computes perplexity and token-level entropy.

- Triggers at most one refine when simple thresholds fire; passes a compact “uncertainty report” (uncertain tokens + top-k alts + local context) back to the model.

- In our tests on technical Q&A / math / code, a small model recovered much of “reasoning” quality at ~⅓ the cost while refining ~⅓ of outputs.
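As a minimal sketch of the capture-and-score step (my own illustration against the OpenAI Chat Completions API; the repo itself uses the Responses API):

```python
# Minimal sketch (not the repo's exact code): request logprobs/top-k
# from an OpenAI model, then compute sequence perplexity and
# per-token entropy.
import math
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
    logprobs=True,
    top_logprobs=5,  # top-k alternatives per generated token
)

tokens = resp.choices[0].logprobs.content  # one entry per output token
logps = [t.logprob for t in tokens]

# Perplexity = exp(-mean log p(token)).
perplexity = math.exp(-sum(logps) / len(logps))

# Entropy over the returned top-k alternatives. This only approximates
# the full-distribution entropy, since tail mass beyond the top-k is
# unobserved.
def topk_entropy(token):
    probs = [math.exp(alt.logprob) for alt in token.top_logprobs]
    return -sum(p * math.log(p) for p in probs if p > 0)

entropies = [topk_entropy(t) for t in tokens]
print(f"perplexity={perplexity:.2f}  max_entropy={max(entropies):.2f}")
```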

I kept seeing “reasoning” models behave like expensive black boxes. Meanwhile, standard inference already computes useful signals both before softmax normalization and after it (logprobs), which we usually throw away. This loop tries the simplest thing you could think of: use those signals to decide when (and where) to think again.

GitHub (notebook + minimal code): https://github.com/monostate/weave-logprobs-reasoning-loop

Paper (short and engineer-written): https://arxiv.org/abs/2509.00079

Blog (more context): https://monostate.ai/blog/entropy-refinement-blog

Requirements: Python and an API that exposes logprobs (tested with OpenAI's non-reasoning 4.1). Set OPENAI_API_KEY, and optionally WEAVE for observability. Run the notebook; it prints metrics and shows which tokens triggered refinement.

- Python, simple loop (no retraining).

- Uses Responses API logprobs/top-k; metrics: perplexity, max token entropy, low-confidence counts.

- Weave for lightweight logging/observability (optional).

- Passing alternatives (not just “this looks uncertain”) prevents over-correction.

- A simple OR rule (ppl / max-entropy / low-confidence count) catches complementary failure modes; a concrete version is sketched after this list.

- Numbers drift across vendors; keeping the method vendor-agnostic is better than chasing fragile pairings.

- Needs APIs that expose logprobs/top-k.

- Results are indicative, not a leaderboard; the focus is on within-model gains (single-pass vs. +loop).

- Thresholds might need light tuning per domain.

- One pass only; not a chain-of-thought replacement.

- Run it on your models and ideas (e.g., 4o-mini, v3, Llama variants with logprobs) and, if you'd like, share logs in a PR against our README on GitHub. PRs welcome; I’ll credit and link.
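For concreteness, here is a hypothetical version of the OR-rule trigger and the uncertainty report (thresholds and field names are illustrative, not the paper's; it consumes the tokens, logps, and entropies from the earlier sketch):

```python
# Hypothetical trigger + report (illustrative thresholds, not the
# paper's): refine at most once if ANY of the three signals fires, and
# hand back the flagged tokens with their top-k alternatives.
import math

PPL_THRESHOLD = 1.4      # sequence perplexity
ENTROPY_THRESHOLD = 1.0  # max per-token top-k entropy (nats)
LOW_CONF_P = 0.5         # a token is low-confidence below this prob
LOW_CONF_COUNT = 3       # ...and the rule fires at this many such tokens

def should_refine(perplexity, entropies, logps):
    low_conf = sum(1 for lp in logps if math.exp(lp) < LOW_CONF_P)
    return (perplexity > PPL_THRESHOLD
            or max(entropies) > ENTROPY_THRESHOLD
            or low_conf >= LOW_CONF_COUNT)

def uncertainty_report(tokens, entropies):
    """Flagged tokens with probabilities and top-k alternatives,
    to be appended to the refinement prompt."""
    return [
        {
            "token": t.token,
            "prob": round(math.exp(t.logprob), 3),
            "alternatives": [a.token for a in t.top_logprobs],
        }
        for t, h in zip(tokens, entropies)
        if h > ENTROPY_THRESHOLD or math.exp(t.logprob) < LOW_CONF_P
    ]
```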

Overall, let me know if you find making small models reason like this useful!


Comments URL: https://news.ycombinator.com/item?id=45118302

Points: 11

# Comments: 0


