[No Merge Until Feb 25] - FastRTC Release Post #2698

Merged
freddyaboulton merged 13 commits into huggingface:main from freddyaboulton/fastrtc on Feb 25, 2025

Conversation

freddyaboulton
Contributor

@freddyaboulton freddyaboulton commented Feb 21, 2025

Congratulations! You've made it this far! Once merged, the article will appear at https://huggingface.co/blog. Official articles
require additional reviews. Alternatively, you can write a community article following the process here.

Preparing the Article

You're not quite done yet, though. Please make sure to follow this process (as documented here):

  • Add an entry to _blog.yml.
  • Add a thumbnail. There are no requirements here, but there is a template if it's helpful.
  • Check you use a short title and blog path.
  • Upload any additional assets (such as images) to the Documentation Images repo. This is to reduce bloat in the GitHub base repo when cloning and pulling. Try to have small images to avoid a slow or expensive user experience.
  • Add metadata (such as authors) to your md file. You can also specify guest or org for the authors.
  • Ensure the publication date is correct.
  • Preview the content. A quick way is to paste the markdown content in https://huggingface.co/new-blog. Do not click publish, this is just a way to do an early check.

Here is an example of a complete PR: #2382

Getting a Review

Please make sure to get a review from someone on your team or a co-author.
Once this is done and once all the steps above are completed, you should be able to merge.
There is no need for additional reviews if you and your co-authors are happy and meet all of the above.

Feel free to add @pcuenca as a reviewer if you want a final check. Keep in mind he'll be biased toward light reviews
(e.g., check for proper metadata) rather than content reviews unless explicitly asked.

@freddyaboulton freddyaboulton changed the title from "FastRTC Release Post" to "[No Merge until Monday 24] - FastRTC Release Post" on Feb 22, 2025
fastrtc.md Outdated

# FastRTC: The Real-Time Communication Library for Python

In the last six months, the AI audio space has exploded with model releases (for both open and closed source models) and investor and developer interest. To name a few milestones:
Member

Suggested change
In the last six months, the AI audio space has exploded with model releases (for both open and closed source models) and investor and developer interest. To name a few milestones:
In the last few months, many new real-time speech models have been released and entire companies have been founded (around both open and closed source models). To name a few milestones:

fastrtc.md Outdated
In the last six months, the AI audio space has exploded with model releases (for both open and closed source models) and investor and developer interest. To name a few milestones:

- OpenAI and Google released their live multimodal APIs for ChatGPT and Gemini. OpenAI even went so far as to release a 1-800-ChatGPT phone number!
- Kyutai released Moshi, a fully open-source audio-to-audio LLM. Alibaba released Qwen2-Audio and Fixie.ai released Ultravox - two open-source LLMs that natively understand audio.
Member

Link to models on Hub

fastrtc.md Outdated

- OpenAI and Google released their live multimodal APIs for ChatGPT and Gemini. OpenAI even went so far as to release a 1-800-ChatGPT phone number!
- Kyutai released Moshi, a fully open-source audio-to-audio LLM. Alibaba released Qwen2-Audio and Fixie.ai released Ultravox - two open-source LLMs that natively understand audio.
- EleveLabs raised $180m in their Series C.
Member

@abidlabs abidlabs Feb 22, 2025

Suggested change
- EleveLabs raised $180m in their Series C.
- ElevenLabs <a href="https://elevenlabs.io/blog/series-c" target="_blank">raised $180m in their Series C</a>.

fastrtc.md Outdated
- Kyutai released Moshi, a fully open-source audio-to-audio LLM. Alibaba released Qwen2-Audio and Fixie.ai released Ultravox - two open-source LLMs that natively understand audio.
- EleveLabs raised $180m in their Series C.

Despite the explosion in the model and funding side, it's still difficult to build real-time AI applications, especially in Python.
Member

Suggested change
Despite the explosion in the model and funding side, it's still difficult to build real-time AI applications, especially in Python.
Despite the explosion in the model and funding side, it's still difficult to build real-time AI applications that stream audio or video, especially in Python.

fastrtc.md Outdated

Despite the explosion in the model and funding side, it's still difficult to build real-time AI applications, especially in Python.

- ML engineers may not have experience with the technologies needed to build real-time applications.
Member

Suggested change
- ML engineers may not have experience with the technologies needed to build real-time applications.
- ML engineers may not have experience with the technologies needed to build real-time applications, such as WebRTC.

fastrtc.md Outdated
- ML engineers may not have experience with the technologies needed to build real-time applications.
- Even code assistant tools like Cursor and Copilot struggle to write python code that supports real-time audio/video applications. I know from experience!

That's why we're excited to announce `FastRTC`, the real-time communication library for Python. The library is designed to make it super easy to build real-time audio and video AI applications entirely in Python! Let's dive in.
Member

I would show an example code snippet here to illustrate how simple FastRTC is to use, before the Core Features.

Member

Then it'll be more natural to talk about the Stream class

Contributor Author

I removed mentions of Stream and merged the core features with the introduction. I don't want them to go after code because they motivate what the code will show about fastrtc.

fastrtc.md Outdated

Let's break it down:
- The `ReplyOnPause` will handle the voice detection and turn taking for you. You just have to worry about the logic for responding to the user. Any generator that returns a tuple of audio, (represented as `(sample_rate, audio_data)`) will work.
- The `Stream` class will build a production-ready Gradio UI for you to quickly test out your stream (or deploy to prod!).
Member

@abidlabs abidlabs Feb 22, 2025

The "deploy to prod" part is not clear to me. I think I would add a third bullet point, something like, "once you have finished prototyping, you can deploy your Stream as a production-ready FastAPI app in a single line of code"
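
For readers following the thread, the handler contract quoted in the excerpt above (a generator that yields `(sample_rate, audio_data)` tuples, which `ReplyOnPause` then wraps) can be sketched without FastRTC installed. This is a minimal, dependency-free illustration and not the post's code: the `echo` function, the `Audio` alias, and the `chunk_ms` parameter are hypothetical names, and a plain list stands in for the audio buffer, which a real handler would more likely receive as an array type.

```python
from typing import Iterator, List, Tuple

# Hypothetical stand-in for the (sample_rate, audio_data) tuple from the excerpt.
# A list of ints keeps the sketch dependency-free; this is an assumption, not
# the library's actual buffer type.
Audio = Tuple[int, List[int]]


def echo(audio: Audio, chunk_ms: int = 20) -> Iterator[Audio]:
    """Yield the caller's audio back in fixed-size chunks."""
    sample_rate, samples = audio
    # Number of samples per chunk; at least 1 so the loop always advances.
    chunk = max(1, sample_rate * chunk_ms // 1000)
    for start in range(0, len(samples), chunk):
        # Each yielded item has the same (sample_rate, audio_data) shape
        # as the input, which is the contract the excerpt describes.
        yield sample_rate, samples[start:start + chunk]


if __name__ == "__main__":
    # 800 samples at 16 kHz, split into 20 ms (320-sample) chunks.
    print([len(data) for _, data in echo((16000, [0] * 800))])  # [320, 320, 160]
```

In the post's setup, a generator of this shape would be handed to `ReplyOnPause`, which takes care of detecting when the user has stopped speaking before the generator is called.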

fastrtc.md Outdated
stream.ui.launch()
```

We're using the SambaNova API since it's fast. But you can use any LLM/text-to-speech/speech-to-text API. Bring the tools you love - `FastRTC` just handles the real-time communication layer.
Member

stt_model = get_stt_model()
tts_model = get_tts_model()

are doing some heavy lifting. I would mention what they are doing, and that you can replace them with your own STT/TTS models or skip them altogether if you use a voice-to-voice model/API.

@abidlabs
Member

Left a few comments @freddyaboulton but otherwise this is looking great ⚡!

Member

@pcuenca pcuenca left a comment

yaml / structure look fine to me, will go through the content later :)

fastrtc.md Outdated
Despite the explosion on the model and funding side, it's still difficult to build real-time AI applications that stream audio and video, especially in Python.

- ML engineers may not have experience with the technologies needed to build real-time applications, such as WebRTC.
- Even code assistant tools like Cursor and Copilot struggle to write python code that supports real-time audio/video applications. I know from experience!
Member

Suggested change
- Even code assistant tools like Cursor and Copilot struggle to write python code that supports real-time audio/video applications. I know from experience!
- Even code assistant tools like Cursor and Copilot struggle to write Python code that supports real-time audio/video applications. I know from experience!

Member

@abidlabs abidlabs left a comment

Beautiful, post looks great to me!

@freddyaboulton freddyaboulton changed the title from "[No Merge until Monday 24] - FastRTC Release Post" to "FastRTC Release Post" on Feb 24, 2025
@freddyaboulton freddyaboulton changed the title from "FastRTC Release Post" to "[No Merge Until Feb 25] - FastRTC Release Post" on Feb 24, 2025
@freddyaboulton freddyaboulton merged commit 982607d into huggingface:main Feb 25, 2025
1 check passed
@freddyaboulton freddyaboulton deleted the freddyaboulton/fastrtc branch February 25, 2025 16:33
3 participants