# Tools

#### [ChatGPT](https://chatgpt.com/)

The original conversational AI launched in November 2022 by [OpenAI](https://openai.com/), which is led by Chief Executive Sam Altman. It is widely considered the industry benchmark for general-purpose intelligence and creative assistance, famously kicking off the modern AI boom.

ChatGPT has expanded beyond text with integrated image generation via DALL·E 3 and experimental multimodal capabilities. OpenAI has also previewed [Sora](https://sora.chatgpt.com/explore), a separate research model for AI video generation.

#### [Gemini](https://gemini.google.com/)

A powerful AI ecosystem developed by [Google DeepMind](https://deepmind.google/), Google's AI research division led by CEO Demis Hassabis. Originally launched as Bard in early 2023, it has since been rebranded and rebuilt on a native multimodal architecture, meaning it was designed from the ground up to understand text, images, video, and audio together.

Gemini is distinguished by its deep integration into the Google ecosystem, allowing it to interact directly with Gmail, Docs, Drive, and Maps to summarise your personal files or plan trips in real time. It features a massive context window that can process hours of video or thousands of pages of text in one go, which makes it well suited to long-form data analysis.

The platform includes Gemini Live for fluid, voice-based conversations on mobile, and Google's experimental Veo model for high-definition video generation. It is the default intelligence across Android devices and Google Search, moving beyond a simple chatbot to become a proactive personal assistant capable of performing complex tasks like "auto-browsing" the web to book appointments or manage professional workflows autonomously.

#### [Claude](https://claude.ai/)

A leading large language model series developed by [Anthropic](https://www.anthropic.com/), a San Francisco-based company co-founded by siblings Dario and Daniela Amodei along with several former [OpenAI](https://openai.com/) researchers. Launched in early 2023, Claude focuses on "[Constitutional AI](https://constitutional.ai/)," which refers to "a set of techniques developed by Anthropic researchers to align AI systems with human values to make them helpful, harmless, and honest. The key ideas behind Constitutional AI are aligning an AI's behaviour with a 'constitution' defined by human principles, using techniques like self-supervision and adversarial training, developing constrained optimisation techniques, and designing training data and model architecture to encode beneficial behaviours."

It is widely regarded as the most sophisticated tool for nuanced writing, complex reasoning, and long-form document analysis, often preferred by professionals for its natural, human-like conversational tone and lower rate of factual errors. The current flagship family is world-class at coding and technical tasks, featuring a massive context window that allows it to process and remember thousands of pages of information or entire codebases in a single session.

While it lacks the native image and video generation tools of its competitors, it excels in productivity features like Artifacts for side-by-side content creation and advanced "computer use" capabilities that allow it to navigate and operate software interfaces to complete multi-step tasks autonomously.

<details>

<summary>Video: "Claude COWORK Clearly Explained (&#x26; how to use it for beginners)" by Eliot Prince</summary>

{% embed url="<https://www.youtube.com/watch?v=ZeWfksNXlbU>" %}

</details>

#### [Grok](https://grok.com/)

A conversational AI developed by [xAI](https://x.ai/), the artificial intelligence company founded by Elon Musk in 2023. Grok was designed as a more irreverent alternative to traditional AI assistants, with a tone inspired by The Hitchhiker’s Guide to the Galaxy and a willingness to engage with unconventional or humorous prompts.

One of Grok's defining advantages is its integration with the X social platform, which allows it to access and summarise trending conversations and breaking news in near real time. This connection enables the model to analyse public posts and surface insights about rapidly evolving events in a way that traditional static language models cannot.

Grok is also a multimodal system capable of handling text, images, and coding tasks, and it is available directly within the X interface as well as through dedicated apps. While it shares many core capabilities with other leading assistants, its positioning emphasises real-time information, a more playful personality, and tight integration with the broader X ecosystem.

Grok also sits within Elon Musk's broader technology ecosystem. Musk has suggested the model could eventually power conversational intelligence across products from [Tesla](https://www.tesla.com/), including in-car assistants and potentially the [Optimus](https://en.wikipedia.org/wiki/Optimus_\(robot\)) humanoid robot, although these integrations have not yet been widely deployed. Following the integration of xAI with SpaceX in 2026, there has also been discussion about combining large-scale AI with space infrastructure such as the [Starlink](https://starlink.com/) network and potential orbital data centres, positioning Grok as a possible intelligence layer across Musk's expanding technology platform.

#### [Copilot](https://copilot.microsoft.com/)

The enterprise-grade AI assistant developed by [Microsoft](https://www.microsoft.com/) under the leadership of Chairman and CEO Satya Nadella. Launched in early 2023 as Bing Chat and later unified under the Copilot brand, it is built on a deep, multi-billion-dollar partnership with [OpenAI](https://openai.com/) and primarily uses OpenAI models.

Copilot is uniquely defined by its "system of agents" architecture, which allows it to function as a seamless layer across the entire [Microsoft 365](https://www.office.com/) ecosystem, including Word, Excel, PowerPoint, and Teams. It excels at grounded productivity, meaning it can pull real-time data from your emails, calendar, and SharePoint files to draft documents or summarise meetings with specific internal context.

The platform has expanded into [Copilot Studio](https://adoption.microsoft.com/en-us/ai-agents/copilot-studio/), which enables users to build their own custom AI agents, and Microsoft has focused on making the assistant a more proactive, personalised companion with a long-term memory of user preferences. While it leverages OpenAI's core technology for reasoning and image generation via DALL·E, Microsoft has recently added model flexibility, allowing enterprise users to choose other "brains" like Claude for specific research tasks within the Copilot interface.

#### [Perplexity](https://www.perplexity.ai/)

An AI-powered "answer engine" founded in 2022 by Aravind Srinivas, Denis Yarats, Johnny Ho, and Andy Konwinski. Unlike traditional chatbots that rely solely on pre-trained data, Perplexity is designed specifically for research and fact-finding by browsing the live internet to provide real-time, sourced answers. Every claim it makes is backed by clickable inline citations, making it a favourite for academics and professionals who need to verify information and avoid the "hallucinations" common in other models.

While it has its own internal models called Sonar, a key differentiator is that Pro users can choose to power their searches with other top-tier "brains". The platform has expanded into a full productivity ecosystem featuring the [Comet](https://www.perplexity.ai/comet) AI-native browser and Deep Research mode, which can autonomously perform multi-step investigations across dozens of websites to produce comprehensive reports. It has moved away from traditional advertising to focus on a subscription and enterprise-first model, positioning itself as a transparent, high-trust alternative to legacy search engines.

#### [Midjourney](https://www.midjourney.com/home)

An independent generative artificial intelligence service focused on high-quality image synthesis, founded by David Holz and operated as a relatively small, self-funded research lab. Since its open beta launch in July 2022, Midjourney has established itself as the leading AI tool for artists, graphic designers, and creative professionals due to its unparalleled aesthetic quality, stylistic consistency, and unique "painterly" or highly detailed textures.

Unlike competitors that often generate photorealistic imagery, Midjourney is famous for its distinctive, artistic interpretation of user prompts. Midjourney v7.2 includes advanced features for character consistency, style referencing across multiple images, the ability to generate short moving paintings or cinematic clips from still output, and sophisticated composition controls like panning and zooming.

While it originally operated almost exclusively through a Discord interface, the service now offers a powerful, simplified web interface that includes social sharing and creation features, though it remains a dedicated image generation specialist without the text analysis or reasoning capabilities of large language model assistants.

#### [Meta (Llama)](https://www.meta.ai/)

Meta AI, led by Mark Zuckerberg, is uniquely defined by its commitment to the "open-weights" movement, making its foundational Llama models available for public download and independent hosting. While competitors like OpenAI and Google keep their most advanced technology behind proprietary walls, Meta's strategy has turned Llama into the industry standard for developers and businesses who require full control, privacy, and customisation of their AI.

The Llama 4 family, including the Maverick and Scout models, introduced a sophisticated "mixture-of-experts" architecture that delivers world-class reasoning and multimodal capabilities with incredible efficiency. These models power the Meta AI assistant across Facebook, Instagram, and WhatsApp, which has become one of the most widely used AI tools globally.
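As a rough illustration of the "mixture-of-experts" idea, the sketch below routes one input to only the two highest-scoring experts and blends their outputs. It is a toy example in plain Python, not Llama 4's actual implementation: the experts, the gate scores, and the function names `route_top_k` and `moe_forward` are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    top = ranked[:k]
    weights = softmax([gate_scores[i] for i in top])
    return list(zip(top, weights))


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and combine their weighted outputs."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# Four toy "experts"; a real model would use neural sub-networks here.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, lambda x: x ** 2]
gate_scores = [0.1, 2.0, -1.0, 2.0]  # hypothetical router output for one input

selected = route_top_k(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)
print(selected, output)
```

Because only two of the four experts run for this input, the compute cost per input stays well below that of running every expert, which is the efficiency argument usually made for this architecture.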

Recently, Meta has pivoted toward "personal superintelligence," focusing on agentic systems that can perform complex, multi-step tasks across its ecosystem of apps. Meta is integrating Llama's intelligence into wearable hardware like the Ray-Ban Meta glasses and its Quest mixed-reality headsets to create seamless AI assistance in the physical world.

#### [Mistral](https://mistral.ai/)

Often called the "European Champion" of artificial intelligence, this Paris-based company was founded in April 2023 by former researchers from Meta and Google DeepMind, including CEO Arthur Mensch.

Mistral has gained international acclaim for its commitment to high-performance "open-weight" models, which provide businesses with the transparency and control to run powerful AI on their own private infrastructure. Their model, Mistral Large 3, is a "mixture-of-experts" powerhouse that rivals the world's most advanced proprietary systems while remaining significantly more efficient to operate.

Their sophisticated multimodal assistant, [Le Chat](https://chat.mistral.ai/chat), features specialised "Focus" modes for deep web research, document analysis, and native "Flash Answers" for near-instant responses. Mistral is particularly popular with technical teams and European enterprises due to its strict adherence to data sovereignty and its specialised tools like [Devstral](https://mistral.ai/news/devstral) for agentic coding and [Voxtral](https://mistral.ai/news/voxtral) for real-time speech understanding.

#### [DeepSeek](https://www.deepseek.com/en/)

A disruptive Chinese AI powerhouse founded in 2023 by Liang Wenfeng, a quantitative hedge fund entrepreneur. DeepSeek has gained global fame for its radical architectural efficiency, proving that frontier-level intelligence can be achieved with a fraction of the budget and computing power used by Silicon Valley giants.

It is particularly celebrated in the developer and research communities for its "open-weight" approach, allowing users to download and run its models locally. The DeepSeek-V4 and R2 Pro models are world-class in "reasoning" tasks, frequently matching or outperforming GPT-5 and Claude in complex mathematics, logic, and software engineering.

While it may lack some of the creative flair or multimodal features of its competitors, it is often the preferred engine for high-volume technical workloads and backend AI agents due to its extreme cost-effectiveness and speed.

#### [Firefly (Adobe)](https://www.adobe.com/ie/products/firefly/landpa.html)

A comprehensive suite of generative AI tools developed by Adobe under the leadership of CEO Shantanu Narayen and Chief Strategy Officer Scott Belsky. Unlike many competitors, Firefly is uniquely distinguished by its "commercially safe" training data, which focuses on Adobe Stock images, openly licensed content, and public domain material, ensuring that creators and enterprises can use generated assets without legal copyright concerns.

It is deeply integrated into the Creative Cloud ecosystem, powering popular features like Generative Fill in Photoshop and Text to Vector in Illustrator. Recently, the platform has expanded significantly beyond static images with the [Firefly Video Model](https://www.adobe.com/products/firefly/features/ai-video-generator.html), which allows for high-fidelity video generation and "generative extend" features in Premiere Pro.

Firefly Image Model 4 Ultra provides industry-leading photorealism and precise control through "Structure" and "Style" reference tools. Uniquely, Adobe has adopted an "open ecosystem" approach, allowing users within Firefly to also access partner models like Google's Gemini Nano or OpenAI's Sora, positioning it as a professional hub for creative AI rather than just a standalone model.

#### [Runway](https://runwayml.com/)

A leading AI video research company co-founded in 2018 by Cristóbal Valenzuela, Alejandro Matamala, and Anastasis Germanidis. Originally a tool for simple video masking, it has evolved into a professional-grade creative suite that defined the "Gen-1" to "Gen-4" eras of AI cinema. The Gen-4 series is famous for its "General World Models," which simulate realistic physics, lighting, and object permanence with incredible precision.

Runway is highly regarded by filmmakers for its granular controls, such as Motion Brush, which allows users to paint specific areas of an image to animate them, and its comprehensive "Video-to-Video" features that restyle existing footage while keeping original movements intact.

It remains the standard for high-end production, offering native audio generation and multi-shot editing tools that allow creators to build complex, consistent cinematic sequences entirely within a browser.

#### [Luma AI](https://lumalabs.ai/)

Originally a specialist in 3D capture and [NeRF technology](https://aws.amazon.com/what-is/neural-radiance-fields/), this San Francisco-based company transitioned into a major video powerhouse with the launch of its "Dream Machine" platform. Led by CEO Amit Jain, Luma is built on the vision of creating multimodal general intelligence that understands the 3D physical world.

The Ray3 model is distinguished by its industry-leading "Modify" workflow, which allows directors to film real actors and then use AI to change their costumes, the environment, or the cinematography while perfectly preserving the actor's original performance and emotional delivery.

Luma is often the preferred choice for projects requiring high-resolution 4K output and realistic human movement, featuring sophisticated "Character Reference" tools that lock a specific person's likeness across multiple different shots to ensure perfect visual continuity.

#### [Sora](https://sora.chatgpt.com/explore)

The high-end video generation model from [OpenAI](https://openai.com/), which famously stunned the industry with its first preview in early 2024. Now in its "Sora 2" iteration, it is fully integrated into the ChatGPT ecosystem and the dedicated Sora mobile app. Sora is unique for its "Disney-grade" visual quality and its massive training scale, which allows it to generate continuous, high-fidelity video clips up to 25 seconds long from a single prompt.

It features native, perfectly synchronised audio generation and a "Cameo" system that allows users to insert licensed characters or consistent digital actors into any scene. While competitors focus on editing tools, Sora excels at "one-shot" creation, using a deep understanding of physical simulation to create videos that are often indistinguishable from real-world footage, supported by advanced "Extensions" that allow users to indefinitely continue a story while keeping the world and characters stable.

#### [Cursor](https://cursor.com/get-started)

A revolutionary "AI-native" code editor built as a fork of VS Code, developed by [Anysphere](https://anysphere.inc/). Unlike other coding assistants that function as simple plugins, Cursor is re-engineered from the ground up to place AI at the heart of the development workflow.

It is widely considered the industry leader for "Agentic Coding," featuring a Composer mode that allows developers to describe complex features in natural language, such as "build me a login page with a dark mode toggle", after which the AI autonomously creates files, writes the code, and executes terminal commands to install dependencies.

Cursor's primary advantage is its "Full Codebase Indexing," which gives the AI a complete understanding of every file and dependency in a project, allowing it to perform massive, multi-file refactors with a level of context and accuracy that standard extensions cannot match.

#### [GitHub Copilot](https://github.com/copilot)

The world's most widely adopted AI developer tool, owned by Microsoft and built on a foundation of [OpenAI](https://openai.com/)'s Codex and GPT-5 models. While it started as a simple "autocomplete" for code, it has evolved into a comprehensive "Copilot Plus" system that spans the entire software lifecycle from the terminal to the pull request. It is deeply integrated into the [GitHub](https://github.com/) platform, offering autonomous "Coding Agents" that can be assigned to fix bugs or implement features directly from a GitHub Issue.

Copilot's "Agentic Memory" allows the AI to learn and remember a team's specific coding patterns, project standards, and internal documentation over time. It offers developers the flexibility to choose their underlying model, allowing them to switch between OpenAI, Claude, or Gemini depending on the complexity of the task at hand.

#### [Seedance](https://www.seedance.ai/)

A groundbreaking multimodal video generation system developed by [ByteDance](https://www.bytedance.com/en/), the parent company of [TikTok](https://www.tiktok.com/en/). Launched in early 2026, Seedance 2.0 disrupted the market with its "Dual-Branch Diffusion" architecture, which generates high-fidelity video and cinema-grade audio simultaneously rather than adding sound as an afterthought.

It is specifically designed for "Narrative Intelligence," excelling at multi-shot storytelling where it can automatically plan and generate an entire edited scene with consistent characters and logical cuts from a single prompt. The model has become a lightning rod for controversy: following viral deepfakes of major actors, it faced immediate cease-and-desist actions from studios like Disney and Paramount over its training data.

It represents the cutting edge of "one-click" production, though it currently carries higher copyright risks for commercial projects compared to more "legally safe" tools like Adobe Firefly.

#### [ElevenLabs](https://elevenlabs.io/)

A specialist AI audio platform co-founded in 2022 by Piotr Dabkowski and Mati Staniszewski. It is the industry standard for high-fidelity synthetic speech, famous for its ability to clone voices and generate emotionally expressive narration that is indistinguishable from a human.

The platform has evolved into an "Audio Infrastructure" giant, powering real-time translation for videos, AI-generated music, and advanced conversational voice agents for apps. It offers the ElevenLabs API, which allows for the integration of ultra-low latency, human-like dialogue into digital products, moving beyond robotic text-to-speech into the era of "Emotional Audio."

#### [Apple Intelligence](https://www.apple.com/ie/apple-intelligence/)

Apple's personal intelligence system is integrated across iOS, iPadOS, and macOS. Unlike standalone chatbots, Apple Intelligence is designed to be "context-aware," meaning it has permission to look at your calendar, emails, and files to assist with daily tasks.

It features a completely overhauled Siri that can perform actions across different apps, such as "send the photos from last night's dinner to Sarah", and includes Writing Tools for proofreading and "Image Playground" for on-device graphic generation. It is distinguished by its "Private Cloud Compute" architecture, which processes complex AI tasks on dedicated Apple servers without ever storing or sharing the user's personal data.

#### [Pika](https://pika.art/)

A creative AI video platform founded by Demi Guo and Chenlin Meng that focuses on "directorial control." While other models excel at raw realism, Pika is famous for its "Pikaffects" suite, which allows designers to apply specific physics to objects, such as making an object melt, explode, or inflate.

The Pika 2.5 engine is highly regarded for its temporal consistency and its intuitive web interface, making it a favourite for designers who need to create marketing assets or surrealist UI animations without a background in 3D modelling.

#### [Figma AI](https://www.figma.com/ai/)

A comprehensive suite of generative tools integrated directly into the industry-standard design platform, led by CEO Dylan Field. Rather than being a separate chatbot, Figma AI is woven into the design canvas to automate tedious tasks and accelerate the journey from idea to high-fidelity prototype.

Key features include "Make Designs," which generates editable UI layouts from simple text prompts, and intelligent layer naming and organisation to keep files tidy. It has expanded into "Figma Sites," allowing designers to turn their canvas directly into live, responsive websites, and features advanced "Visual Search" to find components across massive enterprise libraries.

It is uniquely positioned as the bridge between design and code, using its partnership with OpenAI to power "Dev Mode" features that help engineers translate visual designs into production-ready CSS and React code with perfect accuracy.

#### [Bolt](https://bolt.new/)

An AI-powered development environment created by StackBlitz that runs entirely in the web browser using WebContainer technology. It is designed for "full-stack prompt-to-app" creation, allowing users to describe a web application and watch as Bolt sets up the file structure, installs dependencies, and writes the code in real time.

Unlike many AI builders that only handle the frontend, Bolt is capable of setting up backends, databases, and authentication providers like Supabase or Clerk. It is a favourite for developers and technical product managers because it provides a full, interactive IDE (Integrated Development Environment) where you can manually edit the code alongside the AI. Its standout "diffs" feature makes it incredibly fast, as it only updates the specific lines of code that change during an iteration rather than rewriting entire files.
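The general idea behind diff-based updates can be sketched with Python's standard `difflib` module. This is only an illustration of the technique, not Bolt's actual implementation; the sample file contents and the helper names `changed_regions` and `apply_changes` are hypothetical.

```python
import difflib

# Hypothetical "before" and "after" versions of a small source file.
old = ["const title = 'Hello';", "const theme = 'light';", "render(title, theme);"]
new = ["const title = 'Hello';", "const theme = 'dark';", "render(title, theme);"]


def changed_regions(old_lines, new_lines):
    """Return (tag, old_range, new_range) for every region that differs."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines)
    return [(tag, (i1, i2), (j1, j2))
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_changes(old_lines, new_lines, changes):
    """Patch a copy of old_lines, touching only the changed regions."""
    result = list(old_lines)
    # Work from the bottom up so earlier line indices stay valid.
    for _tag, (i1, i2), (j1, j2) in reversed(changes):
        result[i1:i2] = new_lines[j1:j2]
    return result


changes = changed_regions(old, new)
patched = apply_changes(old, new, changes)
print(changes)
```

Here only one of the three lines is transmitted and rewritten, which is why sending diffs tends to be faster than regenerating a whole file on every iteration.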

#### [Lovable](https://lovable.dev/)

A specialist "vibe coding" platform designed to help non-technical founders and product teams build and deploy full-stack web applications through natural language. While it shares some similarities with Bolt, Lovable focuses more on the "end-to-end" product journey, offering a highly polished chat interface that guides users through planning, database schema design, and deployment.

Its "Agent Mode" can autonomously browse the web for documentation or debug complex errors without human intervention. Lovable is deeply integrated with Supabase for data and GitHub for version control, ensuring that even though the app was built with AI, the user maintains full "code ownership" and can export the project to a professional development team at any time. It is widely considered the fastest way to go from a blank page to a revenue-generating MVP with integrated payments via Stripe.

#### [OpenClaw](https://openclaw.ai/)

OpenClaw is an open-source AI agent framework for orchestrating large language models and tools, including powerful browser automation capabilities for complex web workflows. It is self-hosted and model‑agnostic, giving teams control over infrastructure, data privacy, and provider choice while integrating with existing developer workflows and messaging interfaces.

OpenClaw agents can operate a dedicated, managed browser profile via the Chrome DevTools Protocol, enabling multi‑step, multi‑tab tasks such as automated QA flows, data collection from JavaScript‑heavy sites, and deployment operations triggered from chat. The platform logs agent context and tool usage for inspection, and its architecture combines LLM‑driven reasoning with external tools and scripts to improve reliability compared with brittle, selector‑based web scrapers.
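For readers unfamiliar with the Chrome DevTools Protocol, commands are JSON messages with an `id`, a `method`, and `params`, normally sent over a WebSocket to the browser. The sketch below only builds two such messages and does not connect to a browser. It is not OpenClaw's API: the helper `cdp_command` and the example URL are assumptions for illustration, while `Page.navigate` and `Runtime.evaluate` are standard protocol methods.

```python
import itertools
import json

# Each command needs a unique integer id so replies can be matched to requests.
_ids = itertools.count(1)


def cdp_command(method, **params):
    """Serialise one DevTools Protocol command as a JSON string."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


# A two-step flow: open a page, then read its title.
steps = [
    cdp_command("Page.navigate", url="https://example.com/login"),
    cdp_command("Runtime.evaluate", expression="document.title"),
]

for message in steps:
    print(message)
```

An agent framework would typically generate a sequence of messages like these from the model's plan and read the browser's replies before choosing the next step.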

<details>

<summary>Video: "OpenClaw: The Viral AI Agent that Broke the Internet - Peter Steinberger" by Lex Fridman</summary>

{% embed url="<https://www.youtube.com/watch?v=YFjfBk8HI5o>" %}

</details>

