Mind-blowing AI events from Google and OpenAI

This week was notable for two big AI events from OpenAI and Google.

May 19, 2024

Hey hey, AI aficionados! Welcome to 'The AI Kitchen'—your go-to spot for all the juiciest tidbits and tastiest trends in the world of artificial intelligence. This week was notable for two big AI events from OpenAI and Google.

Here’s what happened this week:

Ilya Sutskever quits OpenAI
OpenAI launches the new model
Google's Gemini updates and Sora competitor

Let's dive in.

Ilya Sutskever quits OpenAI

OpenAI co-founder and chief scientist Ilya Sutskever has announced his departure from the company, a move that follows significant speculation about his role since the controversial ousting of Sam Altman in November 2023.

Highlights:

Leadership Confidence: Sutskever expressed his belief that OpenAI will continue to develop artificial general intelligence (AGI) that is both safe and beneficial under its current leadership.
Additional Departures: Alongside Sutskever, Jan Leike, co-lead of the superalignment group, is also leaving, with his resignation noted for its cryptic nature.
Team Changes: This news comes after several months of departures primarily from OpenAI’s superalignment and safety teams, sparking further speculation.
New Appointment: In the wake of these departures, OpenAI CEO Sam Altman has appointed Jakub Pachocki, a key figure in the development of GPT-4, as the new chief scientist.

Why it's important: The ongoing changes within OpenAI's leadership and safety teams mark the end of ongoing speculation about Sutskever's future with the company. The exit of such prominent figures raises questions about the direction of OpenAI's safety initiatives and makes the next moves of Sutskever and Leike, two of AI’s leading minds, particularly noteworthy.

TOGETHER WITH SMM AGENT

Grow your audience faster on LinkedIn. AI-powered tool to get inspiration, craft content, and leverage your LinkedIn game.

Get your AI assistant to grow your business on LinkedIn
Create a weekly content in 76 seconds
Boost your sales and build your audience

Try SMM Agent for free ->

OpenAI launches the new model

OpenAI has just launched GPT-4o, an advanced multimodal model that melds text, vision, and audio capabilities, establishing new standards in AI performance and introducing an array of innovative features.

Highlights:

Enhanced Multimodal Abilities: GPT-4o excels in processing text, images, audio, and code, significantly outperforming the previous GPT-4T model.
Cost and Efficiency Gains: This model is 50% cheaper to operate, offers five times the rate limits, and delivers twice the generation speed of its predecessors.
Mystery Solved: GPT-4o was also identified as the previously mysterious ‘im-also-a-good-gpt2-chatbot’ that appeared in the Lmsys Arena.

Voice and Other Upgrades:

Advanced Voice Features: The model can now respond in real-time, detect emotional cues, and integrate voice responses with text and visual data.
Practical Demonstrations: Demonstrations included real-time language translation, dual AI analysis of live video, and combined voice and visual aids for tasks like tutoring and coding.

Additional Developments:

Technological Advances: New capabilities include 3D generation, creating fonts, major enhancements in text generation within images, and synthesizing sound effects.
New ChatGPT Desktop App: A desktop application for macOS with a new user interface, making it easier to incorporate ChatGPT into daily computer tasks.

Free Access:

Wider Availability: For the first time, GPT-4o and its extensive features are accessible to all users for free, expanding its reach.

Why it's important: GPT-4o’s real-time, multimodal capabilities are transforming AI from a simple tool to an intelligent partner for collaboration, learning, and innovation. This upgrade not only shifts how users interact with AI but also promises the most significant enhancement for many who previously only had access to more basic models.

Full demo is here.

Google's Gemini updates and Sora competitor

Google just launched a slew of updates at its I/O Developer’s Conference, spotlighting major advancements across its AI ecosystem. These updates include significant enhancements to its Gemini model family and a new video generation model poised to compete with OpenAI’s Sora.

Highlights:

Gemini Model Enhancements: The 1.5 Pro version now boasts a massive 2 million token context window and improved capabilities in coding, logical reasoning, and image understanding. It can handle a diverse range of media types like documents, videos, audio, and codebases.
Speed and Efficiency: The newly introduced Gemini 1.5 Flash is designed for quick and efficient processing with a 1 million token context window.
Next-Gen Models: Google is also gearing up to launch Gemma 2 and a new vision-language model named PaliGemma in the coming weeks.
Personalized AI: For Gemini Advanced subscribers, the ability to create personalized AI ‘Gems’ from simple text descriptions will soon be available, mirroring functionality similar to ChatGPT’s GPTs.
Video and Image Innovations: Google unveiled Veo, a video model that can create videos over 60 seconds long in 1080p from various prompts, and the Imagen 3 model, which offers enhanced detail and language understanding.
Creator Tools: The VideoFX tool allows for storyboard-based video creation with added music, initially available in a private U.S. preview, while ImageFX with Imagen 3 is accessible through a waitlist.

Why it's important: The expansion of Gemini’s context window not only sets a new industry standard but also vastly broadens the potential applications for AI in processing extensive data. Moreover, with the introduction of Veo, Google is not just catching up but setting the stage for a head-to-head with OpenAI’s Sora, promising an exciting race to see which advanced video generation model reaches the public first.

Top-5 AI tools this week

Revid.ai - Create viral AI videos in minutes (link)
Fiski - The modern AI-automated global accounting (link)
Glato - Video ads from just a product link, with AI clones of real creators (link)
Glue - Smarter work chat for people, apps, and AI (link)
ShortlistIQ - AI recruiting to conduct first-round interviews (link)

AI images

Famous athletes if they changed sports (source)

AI video

RIP Rabbit R1 - showcase of ChatGPT-4o (source)

[Not an AI video] Street view is getting wild (source)

Weekly Thread

Weekly Midjourney Prompt

Award-winning photo of beautiful woman in neon dripping latex, sharp, Frank Cho, Walter Van Beirendonck

a black man sitting on top of a white cube, official jil sander editorial, beige and orange tones, trending on interfacelift, coat, artforum, inspired by Zhang Shuqi and Fear of God, yeezy, flagstones, tall and slender, peter morbacher style, wooden platforms, round-cropped, 2 0 1 4. modern attire, harpers bazaar, jeongseok lee

Creative food template. Topping fruit strawberry strawberries splashing dropping onto glass of milk milkshake yoghurt smoothie with liquid droplet splash on pink background. copy text space

Do you build an AI tool?

If you're using AI in your work or projects, I wanna hear all about it. Hit reply to this email and give me the deets. Who knows, maybe we'll feature your work in our AI newsletter. Keep building, and keep making AI awesome --sw 800 --ar 2:3

That's a wrap for today.

See you next week! If you want more tasty AI treats, be sure to follow our AI Chef on Twitter (link).

If you love this episode and want to support us, spread the word about us by sharing The AI Kitchen with friends. We really appreciate it!

Thank you for reading!

What'd you think of today's edition?

Help me to understand what you think about this episode. Just reply with a number (1, 2, or 3) to this email.

1 - Damn good
2 - Meh, do better
3 - You didn't cook it

Your compadre,

Anton "AI Chef" Cherkasov

The LinkedIn Secrets

Discussion about this post

Ready for more?