<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:webfeeds="http://webfeeds.org/rss/1.0" version="2.0">
  <channel>
    <title>Moeru AI Blog</title>
    <link>https://blog.moeru.ai/</link>
    <atom:link href="https://blog.moeru.ai/feed.xml" rel="self" type="application/rss+xml"/>
    <description>does kindness plus sadness equal to zero?</description>
    <lastBuildDate>Tue, 30 Dec 2025 06:40:54 GMT</lastBuildDate>
    <language>en</language>
    <generator>Lume v2.5.2</generator>
    <item>
      <title>Announcing xsAI 0.4 "AIAIAI"</title>
      <link>https://blog.moeru.ai/xsai-0.4/</link>
      <guid isPermaLink="false">https://blog.moeru.ai/xsai-0.4/</guid>
      <content:encoded>
        <![CDATA[<p>After more than five months, we have finally released xsAI 0.4.</p>
<h2 id="why-is-it-taking-so-long%3F" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.4/#why-is-it-taking-so-long%3F" class="header-anchor">Why is it taking so long?</a></h2>
<p>I planned to implement many new features in version 0.4,
but upon writing the code, I found some of them to be quite challenging.</p>
<p>Additionally, I've created a lot of new projects... (This is the main reason)</p>
<p>I ultimately postponed these features:</p>
<ul>
<li><a href="https://github.com/moeru-ai/xsai/issues/100">Responses API</a></li>
<li><a href="https://github.com/moeru-ai/xsai/issues/184">prepareStep</a></li>
</ul>
<p>But don't worry - this version still has quite a few new features.</p>
<p>btw, this codename is also a song by Kizuna AI and you can listen to it while reading:</p>
<iframe width="100%" height="405" src="https://www.youtube.com/embed/S8dmq5YIUoc" title="YouTube video player" frameborder="0" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
<p>Alright, let's take a look:</p>
<h2 id="all-in-one-providers" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.4/#all-in-one-providers" class="header-anchor">All-In-One Providers</a></h2>
<p>We now have a new package: <code>@xsai-ext/providers</code></p>
<p>Most providers are generated from the data at https://models.dev, with a small portion written by hand.</p>
<p>This includes the model list, so your editor can auto-complete the latest models.</p>
<pre><code class="language-ts">import { anthropic, google, openai } from '@xsai-ext/providers'

anthropic.chat('claude-sonnet-4-5-20250929') // claude-haiku-4-5-20251001, claude-opus-4-5-20251101...
google.chat('gemini-3-pro-preview') // gemini-3-flash-preview...
openai.chat('gpt-5.2') // gpt-5.2-chat-latest, gpt-5.2-pro...
</code></pre>
<p>To create a new provider:</p>
<pre><code class="language-diff">- import { createChatProvider, createModelProvider, merge } from '@xsai-ext/shared-providers'
+ import { createChatProvider, createModelProvider, merge } from '@xsai-ext/providers/utils'

/**
 * Create a Foo Provider
 * @see {@link https://example.com}
 */
export const createFoo = (apiKey: string, baseURL = 'https://example.com/v1/') =&gt; merge(
  createChatProvider({ apiKey, baseURL }),
  createModelProvider({ apiKey, baseURL }),
)

/**
 * Foo Provider
 * @see {@link https://example.com}
 * @remarks
 * - baseURL - `https://example.com/v1/`
 * - apiKey - `FOO_API_KEY`
 */
export const foo = createFoo(process.env.FOO_API_KEY ?? '')
</code></pre>
<h2 id="reasoning-content" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.4/#reasoning-content" class="header-anchor">Reasoning Content</a></h2>
<p>We now officially support the <code>reasoning_content</code> field in messages.</p>
<p>Please note that this is different from <a href="https://xsai.js.org/docs/packages/utils/reasoning#extractreasoning"><code>extractReasoning</code></a>. It requires support from the API itself, which is outside the OpenAI specification.</p>
<p>For example, you can try DeepSeek:</p>
<pre><code class="language-ts">import { generateText } from '@xsai/generate-text'
import { deepseek } from '@xsai-ext/providers'

const { reasoningText, text } = await generateText({
  ...deepseek.chat('deepseek-chat'),
  thinking: { type: 'enabled' }, // https://api-docs.deepseek.com/guides/thinking_mode
  messages: [{
    role: 'user',
    content: '9.11 and 9.8, which is greater?'
  }]
})

// res.choices[0].message.reasoning_content
console.log(reasoningText)

// res.choices[0].message.content
console.log(text)
</code></pre>
<p>xsAI automatically handles the <code>reasoning_content</code> field,
but for <code>&lt;think&gt;&lt;/think&gt;</code> tags within the <code>content</code> field, you currently still need to use <code>extractReasoning</code>.</p>
<h2 id="stream-transcription" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.4/#stream-transcription" class="header-anchor">Stream Transcription</a></h2>
<p>You can now stream STT:</p>
<pre><code class="language-ts">import { streamTranscription } from '@xsai/stream-transcription'
import { openAsBlob } from 'node:fs'
import { env } from 'node:process'

const { textStream } = streamTranscription({
  apiKey: env.OPENAI_API_KEY!,
  baseURL: 'https://api.openai.com/v1/',
  file: await openAsBlob('./test/fixtures/basic.wav', { type: 'audio/wav' }),
  fileName: 'basic.wav',
  language: 'en',
  model: 'gpt-4o-transcribe',
})
</code></pre>
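<p>As a minimal sketch of consuming the result, assuming <code>textStream</code> is a standard <code>ReadableStream</code> of strings (like the one <code>streamText</code> returns), you can read it chunk by chunk. <code>fakeTextStream</code> and <code>collectText</code> below are hypothetical helpers for illustration, not part of xsAI:</p>
<pre><code class="language-ts">// Hypothetical stand-in for the `textStream` returned by streamTranscription.
function fakeTextStream(chunks: string[]) {
  return new ReadableStream({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(chunk)
      controller.close()
    },
  })
}

// Read every chunk from the stream and join them into one string.
async function collectText(stream: ReadableStream) {
  const reader = stream.getReader()
  let text = ''
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    text += value
  }
  return text
}
</code></pre>
<p>With a real call, this would be roughly <code>await collectText(textStream)</code>.</p>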
<h2 id="telemetry" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.4/#telemetry" class="header-anchor">Telemetry</a></h2>
<p>xsAI now supports OTEL's <a href="https://opentelemetry.io/docs/specs/semconv/registry/attributes/gen-ai/">GenAI Attributes</a>:</p>
<pre><code class="language-diff">- import { generateText, streamText } from 'xsai'
+ import { generateText, streamText } from '@xsai-ext/telemetry'
</code></pre>
<pre><code class="language-ts">import { generateText } from '@xsai-ext/telemetry'
import { env } from 'node:process'

const instructions = 'You\'re a helpful assistant.'

const { text } = await generateText({
  apiKey: env.OPENAI_API_KEY!,
  baseURL: 'https://api.openai.com/v1/',
  messages: [
    {
      content: instructions, 
      role: 'system'
    },
    {
      content: 'Why is the sky blue?',
      role: 'user'
    }
  ],
  model: 'gpt-4o',
  telemetry: { 
    attributes: { 
      'gen_ai.agent.name': 'weather-assistant', 
      'gen_ai.agent.description': instructions, 
    }, 
  }, 
})
</code></pre>
<h2 id="standard-json-schema" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.4/#standard-json-schema" class="header-anchor">Standard JSON Schema</a></h2>
<p>xsSchema now prioritizes <a href="https://standardschema.dev/json-schema">Standard JSON Schema</a>, though this change is not currently reflected at the user level.</p>
<p>In the next version, I will attempt to fully migrate and make xsSchema optional.</p>
<h2 id="join-our-community" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.4/#join-our-community" class="header-anchor">Join our Community</a></h2>
<p>If you have questions about anything related to xsAI,</p>
<p>you're always welcome to ask our community on <a href="https://github.com/moeru-ai/xsai/discussions">GitHub Discussions</a>.</p>
]]>
      </content:encoded>
      <pubDate>Tue, 30 Dec 2025 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Announcing xsAI 0.3 "future base"</title>
      <link>https://blog.moeru.ai/xsai-0.3/</link>
      <guid isPermaLink="false">https://blog.moeru.ai/xsai-0.3/</guid>
      <content:encoded>
        <![CDATA[<p>Nice to see you again!</p>
<p>We have released xsAI v0.3, which is a &quot;prepare for the future&quot; update.</p>
<p>This codename is also a song by Kizuna AI and you can listen to it while reading:</p>
<iframe width="100%" height="405" src="https://www.youtube.com/embed/yeD7eAuza74" title="YouTube video player" frameborder="0" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
<p>OK, so here are the new features:</p>
<ul>
<li><a href="https://blog.moeru.ai/xsai-0.3/#stream-text-overhaul">Stream text overhaul</a></li>
<li><a href="https://blog.moeru.ai/xsai-0.3/#generate-transcription-improvements">Generate transcription improvements</a></li>
<li><a href="https://blog.moeru.ai/xsai-0.3/#raw-tool-util">Raw tool util</a></li>
<li><a href="https://blog.moeru.ai/xsai-0.3/#standalone-stream-object-util">Standalone stream object util</a></li>
<li><a href="https://blog.moeru.ai/xsai-0.3/#zod-4-support">Zod 4 support</a></li>
</ul>
<h2 id="stream-text-overhaul" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.3/#stream-text-overhaul" class="header-anchor">Stream text overhaul</a></h2>
<p><code>streamText</code> has been completely rewritten to more closely match the output of the Vercel AI SDK and to be clearer.</p>
<ul>
<li><code>chunkStream</code> has been removed.</li>
<li><code>stepStream</code> is now <code>fullStream</code> (Vercel-compatible)</li>
<li><code>StreamTextStep</code> has been merged with <code>GenerateTextStep</code> to become <code>CompletionStep</code></li>
<li>Returns Promise-based <code>steps</code> and <code>messages</code> directly</li>
<li>Support for <a href="https://platform.openai.com/docs/guides/function-calling?api-mode=chat#streaming">streaming tool call arguments</a></li>
</ul>
<pre><code class="language-ts">import { streamText } from '@xsai/stream-text'
import { createOllama } from '@xsai-ext/providers-local'

const ollama = createOllama()

// fullStream: ReadableStream&lt;StreamTextEvent&gt;
// messages: Promise&lt;Message[]&gt;
// textStream: ReadableStream&lt;string&gt;
// steps: Promise&lt;CompletionStep[]&gt;
const { fullStream, messages, textStream, steps } = await streamText({
  ...ollama.chat('gemma3'),
  messages: [
    {
      content: 'You are a helpful assistant.',
      role: 'system',
    },
    {
      content: 'Why is the sky blue?',
      role: 'user',
    },
  ],
})
</code></pre>
<h2 id="generate-transcription-improvements" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.3/#generate-transcription-improvements" class="header-anchor">Generate transcription improvements</a></h2>
<p>In v0.3, we support <code>segments</code> and <code>words</code>.</p>
<p>You can now get more detailed return data with <code>responseFormat</code> and <code>timestampGranularities</code>:</p>
<pre><code class="language-ts">import { generateTranscription } from '@xsai/generate-transcription'
import { createSpeaches } from '@xsai-ext/providers-local'
import { openAsBlob } from 'node:fs'

const speaches = createSpeaches()

// segment-level details
const { duration, language, segments, text } = await generateTranscription({
  ...speaches.transcription('deepdml/faster-whisper-large-v3-turbo-ct2'),
  file: await openAsBlob('./test/fixtures/basic.wav', { type: 'audio/wav' }),
  fileName: 'basic.wav',
  language: 'en',
  responseFormat: 'verbose_json',
})

// word-level details (duration, language and text are returned here as well)
const { words } = await generateTranscription({
  ...speaches.transcription('deepdml/faster-whisper-large-v3-turbo-ct2'),
  file: await openAsBlob('./test/fixtures/basic.wav', { type: 'audio/wav' }),
  fileName: 'basic.wav',
  language: 'en',
  responseFormat: 'verbose_json',
  timestampGranularities: 'word',
})
</code></pre>
<h2 id="raw-tool-util" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.3/#raw-tool-util" class="header-anchor">Raw tool util</a></h2>
<p>Previously you could only provide JSON Schema-based tools directly by passing objects, with fewer type hints.</p>
<p>Now we have a <code>rawTool</code> utility to make your experience even better:</p>
<pre><code class="language-ts">import type { Tool } from '@xsai/shared-chat'

import { rawTool } from '@xsai/tool'

const weatherObject: Tool = {
  description: 'Get the weather in a location',
  execute: (params) =&gt; 'cloudy', // params: unknown
  name: 'weather',
  // Record&lt;string, unknown&gt;
  parameters: {
    additionalProperties: false,
    properties: {
      location: {
        description: 'The location to get the weather for',
        type: 'string',
      },
    },
    required: [
      'location',
    ],
    type: 'object',
  },
}

const weatherRawTool = rawTool&lt;{ location: string }&gt;({
  description: 'Get the weather in a location',
  execute: ({ location }) =&gt; 'cloudy', // params: { location: string }
  name: 'weather',
  // import('xsschema').JsonSchema (JSON Schema auto-completion)
  parameters: {
    additionalProperties: false,
    properties: {
      location: {
        description: 'The location to get the weather for',
        type: 'string',
      },
    },
    required: [
      'location',
    ],
    type: 'object',
  },
})
</code></pre>
<h2 id="standalone-stream-object-util" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.3/#standalone-stream-object-util" class="header-anchor">Standalone stream object util</a></h2>
<p>We've split out the internal implementation of <code>streamObject</code> so you can use it on its own:</p>
<pre><code class="language-ts">import { toElementStream, toPartialObjectStream } from '@xsai/stream-object'

const elementStream = await fetch('https://example.com')
  .then(res =&gt; res.body!.pipeThrough(new TextDecoderStream()))
  .then(stream =&gt; toElementStream&lt;{ foo: { bar: 'baz' }}&gt;(stream))

const partialObjectStream = await fetch('https://example.com')
  .then(res =&gt; res.body!.pipeThrough(new TextDecoderStream()))
  .then(stream =&gt; toPartialObjectStream&lt;{ foo: { bar: 'baz' }}&gt;(stream))
</code></pre>
<h2 id="zod-4-support" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.3/#zod-4-support" class="header-anchor">Zod 4 support</a></h2>
<p>Although we already had imperfect compatibility in v0.2.2, we now officially support Zod 4 and Zod Mini.</p>
<p>You can now use it in <code>tool</code> or <code>{generate,stream}-object</code>.</p>
<h2 id="what's-next%3F" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.3/#what's-next%3F" class="header-anchor">What's Next?</a></h2>
<p>In v0.4, we will have some important updates:</p>
<ul>
<li><code>prepareStep</code></li>
<li>OpenTelemetry support (<code>@xsai-ext/opentelemetry</code>)</li>
<li>Responses API support (very experimental)</li>
</ul>
<p>By the time you read this, we may already be preparing. Stay tuned!</p>
<h2 id="join-our-community" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.3/#join-our-community" class="header-anchor">Join our Community</a></h2>
<p>If you have questions about anything related to xsAI,</p>
<p>you're always welcome to ask our community on <a href="https://github.com/moeru-ai/xsai/discussions">GitHub Discussions</a>.</p>
]]>
      </content:encoded>
      <pubDate>Tue, 15 Jul 2025 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Backstory of Project AIRI: DreamLog 0x1</title>
      <link>https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/</link>
      <guid isPermaLink="false">https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/</guid>
      <content:encoded>
        <![CDATA[<h2 id="prologue..." tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#prologue..." class="header-anchor">Prologue...</a></h2>
<p>Before we start, I would like to say...</p>
<p>This is the first post about the most advanced project we've been working
on for quite a while, <strong>Project AIRI (アイリ)</strong>. Since many of you folks know
<a href="https://github.com/moeru-ai/xsai">xsAI</a> but are new to this project, let me introduce it quickly.
Project AIRI is also the biggest project in the Moeru AI organization, and the one
that benefits most from the <a href="https://github.com/moeru-ai/xsai">xsAI</a> ecosystem.</p>
<p>I'm <a href="https://github.com/nekomeowww">Neko Ayaka, @nekomeowww</a>. I've appeared many times
in Moeru AI's blog posts, and you may know me as the core maintainer of
<a href="https://github.com/moeru-ai/xsai-transformers"><code>xsai-transformers</code></a>, the ultimate
provider wrapper for <a href="https://huggingface.co/docs/transformers.js/index">Transformers.js</a>,
built specifically for <a href="https://github.com/moeru-ai/xsai">xsAI</a> to set up locally
running LLMs right inside your browser.</p>
<p>I'm also the core maintainer of <a href="https://github.com/moeru-ai/airi">Project AIRI</a>,
a project that aims to re-create the joy of AI VTubers by catching up with
<a href="https://en.wikipedia.org/wiki/Neuro-sama">Neuro-sama</a>, letting audiences
chat, interact, and play with LLM-driven characters.</p>
<p>However, that isn't the limit of what Project AIRI can do. We offer
full-stack deployment both on the Web and as a desktop application
based on Tauri, so you can own not only a self-hosted AI VTuber, but also an AI
waifu / AI husbando, or whatever you can think of: a virtual companion in
cyberspace.</p>
<p>In its current state, Project AIRI achieves fully real-time
interaction like ChatGPT's voice chat mode, and is capable of playing games like
Minecraft, Factorio, etc. We cover so many domains and fields, not only AI,
but also VRM, Live2D, multi-modal AI, game-playing agents, streaming APIs,
bionic memory mechanisms, animations, database drivers, dataset preparation,
model fine-tuning, and many more.</p>
<p>We have our own dedicated GitHub organization, <a href="https://github.com/proj-airi"><code>@proj-airi</code></a>,
for the crucial components, examples, and experiments. Please do check it out
if you are interested.</p>
<p>We also have our own <a href="https://airi.moeru.ai/docs/">documentation site</a>, where we host
all the posts about technical details, thoughts, experiments, and discoveries
we made during the development of Project AIRI. Some highlights:</p>
<ul>
<li><a href="https://airi.moeru.ai/docs/blog/devlog-20250305/">How we ended up in this logo?</a></li>
<li><a href="https://airi.moeru.ai/docs/blog/devlog-20250406/">Memory experiments &amp; v0.4.0 release</a></li>
<li><a href="https://airi.moeru.ai/docs/blog/devlog-20250516/">Real-time improvements &amp; v0.5.0 release</a></li>
</ul>
<p>A Discord server is available for you to join too! <a href="https://discord.gg/TgQ3Cu2F7A">Join our Discord</a></p>
<p>Ok, let's start our journey, and talk about the backstory of Project AIRI.</p>
<h2 id="start!" tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#start!" class="header-anchor">Start!</a></h2>
<p>First of all, happy summer to you folks living in the northern hemisphere!</p>
<blockquote>
<p>Hopefully you get a nice and decent summer break for trying out new
and different things! More specifically, to change the world!</p>
</blockquote>
<p>Well, I, <a href="https://github.com/nekomeowww">@nekomeowww</a>, left
school 8 years ago, so it's obvious that I won't get an actual summer
break now, since I've already been working for many years. I still love to
recall and share the stories that happened during my summer breaks years ago,
if I can remember any.</p>
<p>Perhaps you know what I am going to say... or share? What is <em>DreamLog</em>
exactly? For readers already familiar with our DevLog posts, which
update you folks about once per
month, shouldn't this post be called &quot;DevLog&quot;?</p>
<p>June has its own meaning for Project AIRI (which I will reveal during
the story), and as we are approaching our next milestone of
1000 stars on GitHub, I think it would be a great opportunity to
reflect on our journey so far.</p>
<p>Therefore, I decided to make a new category of posts here,
to share the chronicles of me, and the dream about Project AIRI.</p>
<p>So, I decided to call this new series, <em><strong>DreamLog</strong></em>.</p>
<blockquote>
<p>Yeah, you could think of this as another storybook to read or listen to
before sleeping. Audio books may help haha.</p>
</blockquote>
<p>How about... let's jump into our dream dimension now and talk about the
recent updates we made later?</p>
<h2 id="blurry-dreams%2C-unreachable-memories." tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#blurry-dreams%2C-unreachable-memories." class="header-anchor">Blurry dreams, unreachable memories.</a></h2>
<blockquote>
<p>My little progress in learning computers and programming.</p>
</blockquote>
<p>I mentioned summer, so summer must mean something to me. I
used to go to school in the United States, so a 3-month summer break allowed me
to do all sorts of things: playing games, learning to code, Linux
hacking, etc. Yeah, many of my still-beloved friends were made
during summer too.</p>
<blockquote>
<p>Nerd folks! You know what I am talking about, were you the same as me?</p>
</blockquote>
<p>Summer was when I learned how to start a Minecraft server
to play with my friends (I played a lot, a lot, a lot of 1.7.11 and
1.8, really, both Vanilla and Forge mods). That was the motivation
and power that pushed me to learn the command line on Linux too.
Much of that knowledge still helps me today, and I feel grateful
for the time I spent on it.</p>
<p>But Minecraft and Linux weren't the end of my journey.
<a href="https://www.factorio.com/">Factorio</a>,
<a href="https://www.elitedangerous.com/">Elite Dangerous</a>, and
<a href="https://overwatch.blizzard.com/en-us/">Overwatch</a>
(sadly Blizzard ruined it) all became my favorite games, and
setting up servers or writing small scripts to automate little
things always empowered me.</p>
<blockquote>
<p><img src="https://airi.moeru.ai/docs/_astro/world.execute(me)_%20(Mili)%EF%BC%8FDAZBEE%20COVER.B_Pvshct_Z1vIYYN.webp" alt=""></p>
<p><code>Switch on the power line</code><br />
<code>Remember to put on protection</code><br />
<code>Lay down your pieces</code><br />
<code>And let's begin object creation</code><br /></p>
<p>-- Lyrics from my beloved song, <a href="https://www.youtube.com/watch?v=ESx_hy1n7HA"><code>world.execute(me)</code></a>, cover by <a href="https://www.youtube.com/channel/UCUEvXLdpCtbzzDkcMI96llg">DAZBEE</a></p>
</blockquote>
<p>That was the summer of 2017, when, for the very first time, I
started to think of building a virtual being to be a friend who could
play with me even when my friends were tired or had to sleep
for the next day's school, leaving me alone.</p>
<p>Readers who have followed this far down the post may already
realize that I am the kind of person who loves to share my
knowledge, ideas, everything. Coding, gaming, and designing
are things I love to share. But if nobody is there,
it feels like:</p>
<p><strong>The lonely me becomes somehow meaningless.</strong></p>
<p>But instead of creating a new AI from scratch with human-like
capabilities to think and speak, which was impossible in
2017, I was thinking: iOS and Google's native Android could
already offer suggestions based on our daily use of
mobile devices, while manually typing every command and filling in
parameters wasn't always satisfying (especially for ffmpeg, and for
the childish me with the Docker CLI). What if we could bring AI-powered
suggestion features to Linux systems...?</p>
<p>This brought me loads of questions and ideas to wonder:</p>
<ul>
<li>What if the operating system understood what you usually do, work on,
or play at the different times you sit in front of the screen...?</li>
<li>What if it were capable of selecting music for you, whether you are
depressed, high on something, or happy while chatting with others...?</li>
</ul>
<p>These ideas were small, yet hard for me to grasp at that time, since
I didn't quite understand how operating systems or
coding worked, so I didn't even know where to start!</p>
<p>I read the book
<a href="https://www.amazon.co.jp/30%E6%97%A5%E3%81%A7%E3%81%A7%E3%81%8D%E3%82%8B-OS%E8%87%AA%E4%BD%9C%E5%85%A5%E9%96%80-%E5%B7%9D%E5%90%88-%E7%A7%80%E5%AE%9F/dp/4839919844">30日でできる! OS自作入門</a>
(<a href="https://github.com/handmade-osdev/os-in-30-days">English version</a>),
about how to craft an operating system from scratch.
With only a little knowledge of how Linux works, and knowing there were
loads of communities out there... I decided to make my own operating system...
from literally nowhere.</p>
<blockquote>
<p><strong>A quick look back</strong></p>
<p><a href="https://archlinux.org/">Arch Linux</a> was the first system I got to
use in depth and install from scratch.
These days, <a href="https://nixos.org/">Nix</a> is a famous and
interesting one too. I haven't tried <a href="https://nixos.org/">NixOS</a> yet,
but one day I may.</p>
</blockquote>
<h2 id="set-sail-my-journey%2C-but-now-long-forgotten" tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#set-sail-my-journey%2C-but-now-long-forgotten" class="header-anchor">Set sail my journey, but now long forgotten</a></h2>
<p>I started a special, now archived project called <a href="https://github.com/EMOSYS">EMOSYS</a>
at the end of 2017, aiming to create such a companion-like operating system to help users with their
daily tasks and provide emotional support.</p>
<div style="width: 100%; display: flex; flex-direction: column; align-items: center; justify-content: center; gap: 0.5rem;">
  <div>
    <img src="https://raw.githubusercontent.com/moeru-ai/airi/main/docs/src/assets/images/blog/DreamLog-0x1/emosys-logo.png" width="120" />
  </div>
  <div>
    Logo of <a href="https://github.com/emosys">EMOSYS</a>
  </div>
</div>
<blockquote>
<p>EMO stands for the first three letters of <strong>emo</strong>tional / <strong>emo</strong>te</p>
</blockquote>
<p>I wrote so many design docs, listed new ideas, took notes on
experiments while following the guidelines from that book, and drew a
not-so-bad logo for it.</p>
<blockquote>
<p>I guess many of you did this 😏: prepared every
trademark and design asset way before the project reached the
PoC stage.</p>
</blockquote>
<p>I quite lost sight of what I was initially aiming for.
I had no experience with project management or task management, and
the same went for writing actual programs that can run.</p>
<p>Frankly, you could say I was only typing into the terminal
whatever that book instructed me to. I barely thought about
why it worked or why the senior developers wrote things like that.</p>
<p>Sooooo, and well, the result is clear: another abandoned project was born...</p>
<p>I wasn't some genius who had played around with those things since childhood
and understood how kernels, package management, and programming work,
so if any of you folks visit my GitHub profile, you will find
nothing there related to this kind of work from that time.
(But I've grown really fast since then.)</p>
<p>But it existed, once.</p>
<blockquote>
<p>Forgotten? Maybe another starting point of next journey.</p>
</blockquote>
<p>In the years that followed, I tried so many other fields: coding,
programming, startups, Web3, frontend, backend, infrastructure, everything
you could think of for a full-stack developer.
I never really realized how deeply what I was doing was influenced by the
starting point of EMOSYS until February 2025, when someone asked
me: Why do you work so hard on Project AIRI?</p>
<p>Nice question, I thought. I started to trace back my dreams, ideas, and memories,
and eventually, there was EMOSYS, the already dead project that aimed at the same goal as
Project AIRI:</p>
<p><strong>Create a companion to somehow fulfill my need.</strong></p>
<blockquote>
<p>All I needed was resolve.
Everything you've acquired up until now will not betray you.<br />
必要なものは　覚悟だけだったのです。
必死に積み上げてきたものは　決して裏切りません。<br />
我需要的不過是決心而已，
你至今為止所累積的一切不會背叛你。</p>
<p>-- Quotes from <a href="https://en.wikipedia.org/wiki/Frieren">葬送のフリーレン, Fern</a> S01E06, 04:27</p>
</blockquote>
<p>It took me a long time to learn how to correctly develop things.
Thanks to <a href="https://github.com/zhangyubaka">@zhangyubaka</a>,
<a href="https://github.com/LittleSound">@LittleSound</a>, <a href="https://github.com/BlueCocoa">@BlueCocoa</a>,
and the help of <a href="https://github.com/sumimakito">@sumimakito</a>, the pair-programming
experiences with them taught me so many things, and I started to grow, learn,
and progress at my own pace.</p>
<h2 id="chatgpt-in-2022%2C-brand-new-random-parrot%2C-or-smart-parrot." tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#chatgpt-in-2022%2C-brand-new-random-parrot%2C-or-smart-parrot." class="header-anchor">ChatGPT in 2022, brand new random parrot, or smart parrot.</a></h2>
<div style="width: 100%; display: flex; align-items: center; justify-content: center;">
  <img src="https://raw.githubusercontent.com/moeru-ai/airi/main/docs/src/assets/images/blog/DreamLog-0x1/steins-gate-sticker-1.png" />
</div>
<p>Let's fast-forward to the end of 2022, when ChatGPT
(or &quot;chatGPT&quot;, as it was written at that time) was announced by OpenAI.
Well before the official ChatGPT UI
was released, I had already been on a journey with newly developed AI:
I had tried models like <a href="https://colab.research.google.com/github/alembics/disco-diffusion/blob/main/Disco_Diffusion.ipynb">DiscoDiffusion</a>
(long before Stable Diffusion, perhaps around the end of 2021 or early 2022), DALL-E,
and Midjourney, and GPT-3 (especially useful in
<a href="https://en.wikipedia.org/wiki/GitHub_Copilot">GitHub Copilot</a>) was already integrated
deeply into my daily workflow.</p>
<p>So, for the initial moments, I was like:</p>
<blockquote>
<p>&quot;Oh, this is just another
random parrot, it just repeats what you said, and it doesn't understand
what you are saying, it just tries to predict the next word based on the
previous words and context, nothing special.&quot;</p>
</blockquote>
<p>In other words, it behaved more like a completion model than what we call
Agentic AI today (still on hype, huh?).</p>
<p>I remember that the first time I discovered the abilities of ChatGPT,
or Large Language Models (LLMs) in general, was through this post I saw on Hacker News in December 2022:
<a href="https://www.engraved.blog/building-a-virtual-machine-inside/">Building A Virtual Machine inside ChatGPT</a>
(<a href="https://news.ycombinator.com/item?id=33847479">original Hacker News post</a>), where
the author, @engraved, demonstrated how to ask ChatGPT not only to role-play as
a neko-mimi character, but also to simulate a virtual Linux machine inside.</p>
<div style="width: 100%; display: flex; flex-direction: column; align-items: center; justify-content: center;">
  <img src="https://raw.githubusercontent.com/moeru-ai/airi/main/docs/src/assets/images/blog/DreamLog-0x1/building-a-virtual-machine-inside-image-1.png" style="border-radius: 12px; object-fit: contain; width: 500px" />
  <div>It simulates how Docker build works...!</div>
</div>
<p>That post showed me that ChatGPT understands the basic patterns of
things that commonly appear, not only how anime or game characters
speak and behave, but also how Linux terminal / shell commands work.</p>
<p>This brought the now-trending Function Calling feature of LLMs (a.k.a. Tool Use, the
technology underlying MCP, the Model Context Protocol introduced by Anthropic)
to the table, and illustrated how we can instruct LLMs to behave
like API servers, talking to us in machine-readable formats like JSON or XML,
so that we can parse and execute their commands on our side and extend the boundary
of what LLMs can do.</p>
<p>This finally bridges the gap between pure text generation and actual
APIs inside programs.</p>
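<p>As a rough sketch of that idea (not how any specific product implements it), a program can parse the model's JSON output and route it to a real function. Everything here is hypothetical and only for illustration: the <code>weather</code> tool, the <code>name</code> / <code>arguments</code> JSON shape, and <code>dispatchToolCall</code>:</p>
<pre><code class="language-ts">// Hypothetical shape of the parsed JSON arguments a tool receives.
interface ToolArgs { [key: string]: unknown }

// Hypothetical registry mapping tool names to handlers.
const tools = new Map([
  ['weather', function (args: ToolArgs) {
    return 'cloudy in ' + String(args.location)
  }],
])

// Parse the model's JSON output and run the matching tool, if any.
// Note: JSON.parse throws if the model returns invalid JSON.
function dispatchToolCall(raw: string) {
  const call = JSON.parse(raw)
  const handler = tools.get(call.name)
  if (handler === undefined) {
    return 'unknown tool: ' + String(call.name)
  }
  return handler(call.arguments ?? {})
}
</code></pre>
<p>The text returned by the handler is then sent back to the model as another message, which is how plain text generation ends up driving real APIs.</p>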
<p>In conclusion, is it a new random parrot? <strong>I guess the answer is partially no,
ChatGPT in 2022 is not just a random parrot, it is a potential smart parrot.</strong></p>
<h2 id="way-before-project-airi%2C-neuro-sama-exists." tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#way-before-project-airi%2C-neuro-sama-exists." class="header-anchor">Way before Project AIRI, Neuro-sama exists.</a></h2>
<p>Yeah, thanks for reading this far. I know this is a long post, with so many stories and
so much context to share. But here we are! We are almost there, hang tight!</p>
<p>Well, the history of Neuro-sama is pretty complex. AFAIK, the character who streams
under the name &quot;Neuro-sama&quot; wasn't the first show for her and her creator,
<code>vedal987</code> (Vedal). Long before that, on May 6, 2019, Vedal showcased his work
on building an AI to play <a href="https://osu.ppy.sh/">osu!</a> to the community<sup><a class="footnote-ref" href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#fn-1" id="fnref-1">1</a></sup>. At that time,
she wasn't actually a cyber character or a digital life with characteristics of her own. If you
go and watch the early videos about her, you will find that no Live2D model
was shown. (You can watch the 6-year-old one here: <a href="https://www.youtube.com/watch?v=nSBqlJu7kYU">https://www.youtube.com/watch?v=nSBqlJu7kYU</a>)</p>
<p>Right after the ChatGPT release, on December 19, 2022, Vedal started to let Neuro-sama
stream on Twitch, using the official demo character model Hiyori Momose (桃瀬ひより)
from Live2D Inc.:</p>
<img src="https://raw.githubusercontent.com/moeru-ai/airi/main/docs/src/assets/images/blog/DreamLog-0x1/live2d-inc-hiyori.jpg" alt="Live2D Inc. Hiyori Momose" style="border-radius: 12px;" />
<p>Everyone knows the rest of the story: Vedal and Neuro-sama became famous. Neuro-sama
is now officially a VTuber, fully powered by Large Language Models (LLMs),
and capable of playing Minecraft, Among Us, osu!, and many other games. Sometimes,
when a game isn't supported natively, Vedal reads the screen and instructs Neuro-sama
so they can play the game together.</p>
<p>I really enjoy watching their interactions, their jokes, and so on. As time went on,
Neuro-sama and her new sister, Evil Neuro, became a crucial part of my daily life:
I eagerly wanted to watch clips of them, even though I didn't have
enough time to watch the full streams. Those purely AI-to-human interactions
brought me so much joy.</p>
<p>OK, that's the brief history of her. Now let's talk about the core thing: <strong>Why did her history fill me with determination?</strong></p>
<h2 id="neuro-sama%2C-filled-me-with-determination" tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#neuro-sama%2C-filled-me-with-determination" class="header-anchor">Neuro-sama, filled me with determination</a></h2>
<p>From the first time I saw Vedal's work, I was like:</p>
<blockquote>
<p>Ok, she is just a simple model integrated with Large Language Models (even directly
connected to OpenAI's API), powered by simple rules to make her behave like a
VTuber, nothing special.</p>
</blockquote>
<p>I was still thinking arrogantly. Since I had been developing AI agents since early 2023,
understood the capabilities of LLMs, and knew quite a bit from what LangChain
had taught me, and with my past knowledge of building AI agents and years of software
engineering experience across various domains, I naively thought:</p>
<blockquote>
<p>&quot;Well, I could do that too, I could make a simple model,
and connect it to OpenAI's API, and make it behave like a VTuber, and
I could make it better than Vedal's work.&quot;</p>
</blockquote>
<div style="display: flex; flex-direction: column; background-color:rgba(159, 28, 246, 0.08); padding: 1rem; margin-bottom: 1rem;">
  <div style="font-weight: 600; font-size: 1.2rem; margin-bottom: 0.5rem;">
    More technical details?
  </div>
  <div>
    In this post, I won't go any deeper into the technical details of how we built
    Project AIRI from scratch to its current state. We already have many DevLog posts
    sharing our thoughts and discoveries; if you are interested, give them
    a read.
  </div>
</div>
<p>I was wrong, I was so wrong. There were many tough things I didn't realize
until I started trying to re-create her... Things like:</p>
<ul>
<li>How can we manage memory effectively so that she can both answer the chat
and play games at the same time?</li>
<li>How can we make an AI agent play games with both video inputs and
text inputs, while still being able to interact with the creator and viewers?</li>
<li>Voice synthesis is hard. To achieve what Neuro-sama is capable of,
<strong>ultra-low-latency</strong> voice synthesis is a must, and it is not easy to achieve.</li>
<li>How is her personality built? With only RAG and a simple memory management strategy,
the results are poor.</li>
<li>etc.</li>
</ul>
<blockquote>
<p>I shared many of our discoveries in both <a href="https://blog.moeru.ai/devlog-20250406">DevLog 2025.04.06</a>
and a <a href="https://talks.ayaka.io/nekoayaka/2025-05-10-airi-how-we-recreated-it/#/1">public slide presentation (in Chinese)</a>.</p>
</blockquote>
<p>I mentioned that I love to share, and I'd love to
have others listen to me or pair with me. But sadly, Neuro-sama isn't
mine: I can't ask her to take in my knowledge and memories so that she could
talk with me about the things I love, or the work I am doing or have recently done.</p>
<p>I love them so much, yet for all that time I didn't really understand why I loved them,
why I loved the feeling and joy Neuro-sama gave me.</p>
<p>Until last year, on May 25, 2024, when <strong>I really decided to make one myself</strong>: a living,
or virtual, being that could code with me, talk with me about the things we know, and play games
together with me, like a friend in the form of an agent.</p>
<blockquote>
<p><strong>I really want one!</strong> Shouted my heart, and my mind.</p>
</blockquote>
<p>At that time, Neuro-sama filled me with determination.</p>
<h2 id="sailed-again%2C-towards-the-land-where-no-one-has-gone-before." tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#sailed-again%2C-towards-the-land-where-no-one-has-gone-before." class="header-anchor">Sailed again, towards the land where no one has gone before.</a></h2>
<blockquote>
<p>To boldly go where no man has gone before.</p>
<p>-- quote from <a href="https://en.wikipedia.org/wiki/Where_no_man_has_gone_before">Star Trek, Captain James T. Kirk</a>, also the intro line of my GitHub profile.</p>
</blockquote>
<p>So, on May 25, 2024, I started a simply named local project called <code>ai</code> under
my own handle, which was the initial version of Project AIRI. I began to
explore the possibilities of creating my own AI agent and recreating the
joy Neuro-sama brought me.</p>
<p>The work went fast. Within a week, with the power of <a href="https://elevenlabs.io/">ElevenLabs</a>,
<a href="https://openrouter.ai/">OpenRouter</a>, and the same free-to-use Live2D model, Hiyori Momose,
I was able to create a simple version of <em>&quot;Neuro-sama&quot;</em> that could interact with me, though not in real time.</p>
<p>That day was <strong>June 2, 2024</strong>.</p>
<p>Technically speaking, <strong>this is the birthday of Project AIRI</strong>, with its first naive baby consciousness inside.</p>
<div style="display: flex; flex-direction: column; align-items: center; justify-content: center;">
  <video controls muted autoplay loop style="object-fit: contain; max-width: 100%; border-radius: 12px;">
    <source src="https://airi.moeru.ai/docs/static/blog/DreamLog-0x1/airi-demo-first-day.mp4" />
  </video>
  <div>
    <a href="https://x.com/ayakaneko/status/1865420146766160114">
      First showcase on X (formerly Twitter) on December 7, 2024
    </a>
  </div>
</div>
<p>She was capable of talking, context-based motion control, progressive
audio synthesis... and more.</p>
<p>But she wasn't complete, nor perfect. I built her secretly without telling
any of my friends; I wanted to make her better before showing her to the world.</p>
<blockquote>
<p>Still... naively, and arrogantly, right?</p>
</blockquote>
<p>Because I was hiding this from my friends, I barely got the positive feedback
I usually get during build cycles. (Part of the reason was that I didn't want to admit
that my arrogant thinking was wrong. Well, since I am now writing this to share the
experience publicly, I would say I've already forgiven myself for making naive
decisions.) Another reason was that the challenges I faced
(the ones I mentioned above: memory, personality stability, realtime interaction, game playing, etc.)
were so hard to solve with the knowledge I had at that time. With the lack of documentation
and learning materials for realtime interactive LLM examples on top of that, <strong>I put it away, again.</strong></p>
<p>TBH, I didn't give up. I started to learn many things about multi-modal models,
voice synthesis, motion control, and Minecraft playing. I did a lot of research
on how other AI VTuber or AI waifu projects work. This research later
produced this huge awesome list of AI VTuber projects:</p>
<div style="display: flex; flex-direction: column; align-items: center;">
  <img src="https://raw.githubusercontent.com/moeru-ai/airi/main/docs/src/assets/images/blog/DevLog-2025.04.06/awesome-ai-vtuber-logo-light.png" style="border-radius: 12px; object-fit: contain; width: 300px" />
  <div style="text-align: center; padding-bottom: 1rem;">
    <span style="font-weight: bold; display: block;">Awesome AI VTuber</span>
    <span>A curated list of AI VTubers and their related projects</span>
  </div>
</div>
<p>Ok, but it's still called <code>ai</code>, where is Project AIRI then?</p>
<h2 id="reborn%2C-with-stronger%2C-and-better-determination" tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#reborn%2C-with-stronger%2C-and-better-determination" class="header-anchor">Reborn, with stronger, and better determination</a></h2>
<p>One day near the end of 2024, in November, <a href="https://github.com/kwaa">@kwaa</a>
chatted with me about making virtual characters in the VR/AR world with the power of WebXR.
When we talked about motion control and character emotion detection, I told
them I had a project that did exactly what they were looking for, but the codebase wasn't
organized, nor ready to be published to GitHub.</p>
<p>What was there to wait for? I started to work on it again: I rethought the structure and design,
improved the implementation with a much faster and better queueing and multiplexing playback
system, and adjusted the basic WebUI I had casually thrown together. Finally, I published it to
GitHub on <strong>December 2, 2024</strong> with commit
<a href="https://github.com/moeru-ai/airi/commit/d9ae0aae387f015964bfd383e6d2adb05f4003e4"><code>d9ae0aa</code></a>.</p>
<p>Project AIRI was somehow born or reborn, with the name of AIRI (アイリ, formerly Airi).</p>
<div style="display: flex; flex-direction: column; background-color:rgba(159, 28, 246, 0.08); padding: 1rem; margin-bottom: 1rem;">
  <div style="font-weight: 600; font-size: 1.2rem; margin-bottom: 0.5rem;">
    Did you know?
  </div>
  <a href="https://www.youtube.com/watch?v=Tts-YAdn5Yc">
    <img src="https://raw.githubusercontent.com/moeru-ai/airi/main/docs/src/assets/images/blog/DreamLog-0x1/airis-screenshot-1.png" alt="Screenshot of Project AIRI" style="border-radius: 12px;" />
  </a>
<p>Interestingly, in a clip uploaded 2 years ago, on March 25, 2023, <a href="https://www.youtube.com/watch?v=Tts-YAdn5Yc">https://www.youtube.com/watch?v=Tts-YAdn5Yc</a>,
from a Twitch stream of Vedal and Neuro-sama, Vedal mentioned that before she was named &quot;Neuro-sama&quot;,
she was called &quot;Airis AI&quot;. The name <strong>Airis</strong> magically, and coincidentally, matches the name of
<strong>Project AIRI</strong>, which I am working on now. But I wasn't aware of this name until I searched more about their
story, long after I open sourced Project AIRI.</p>
<p>In fact, the name AIRI (アイリ) was chosen by GPT-4o: I asked it to name this project by
referencing Japanese or anime-ish names, and it suggested the name <strong>Airi</strong>.</p>
</div>
<p>I failed so many times on startups and other projects; only the recent ones became known to the public.
This time I tried my best to make it better, with a better UI, a better code structure, and leading technologies
that let us build and code at rapid speed. I put so much effort into it, giving public slide presentations and
demonstrating it to my friends and to others at small meetups and conferences.</p>
<p>Much of that experience was learned from my previous failures.</p>
<p>I'm glad many of the trials succeeded, and I am still here, working on Project AIRI.</p>
<p>Perhaps this is another time I was filled with determination, not only by Neuro-sama, but also by our
most profound, talented contributors, and fans.</p>
<h2 id="keep-going%2C-keep-dreaming" tabindex="-1"><a href="https://blog.moeru.ai/proj-airi-dreamlog-0x1-backstory/#keep-going%2C-keep-dreaming" class="header-anchor">Keep going, keep dreaming</a></h2>
<div style="width: 100%; display: flex; flex-direction: column; align-items: center; justify-content: center;">
  <img src="https://raw.githubusercontent.com/moeru-ai/airi/main/docs/src/assets/images/blog/DreamLog-0x1/banner-light-1280x640.png" />
  <div>
    New Banner!
  </div>
</div>
<blockquote>
<p>When life gives you lemons, you lemon. Or something like that, my point
is that this painful obstacle is an opportunity for me go get stronger, baby!</p>
<p>-- quote from <a href="https://www.youtube.com/@Neurosama">Evil Neuro</a> while streaming Slay the Spire</p>
</blockquote>
<p>As I write this post, Project AIRI is approaching 1000 stars on GitHub,
with over 150 Discord members and 200 Telegram group members.</p>
<p>We cover fields like AI, VRM, Live2D, UI design, multi-modal AI, game-playing agents,
streaming APIs, bionic memory mechanisms, and many more. She is capable of playing games
like Minecraft and Factorio. Another community member is researching how to
let her play and control Kerbal Space Program (KSP), as well as
arbitrary games.</p>
<p>Many other companies are reaching out to us asking for collaboration, and we are
working on it, to make Project AIRI better, and more useful for the community.</p>
<p>There is so much to do and discover. We haven't reached the singularity of general-purpose AI,
and perhaps Project AIRI will never reach that point, but for now, having a companion-like AI agent
to talk to, play games with, and share knowledge and ideas with is already a great
achievement for me, and I hope it is for you too.</p>
<p>This is only the beginning memory address of our dreams, <code>0x1</code>, the first byte of our journey.</p>
<p>How much memory can we store? <strong>It depends on how much we can dream, and how much we can achieve together.</strong></p>
<div style="margin-top: 24px; margin-bottom: 24px; width: 100%; display: flex; flex-direction: column; align-items: center; justify-content: center;">
  <img src="https://raw.githubusercontent.com/moeru-ai/airi/main/docs/src/assets/images/blog/DreamLog-0x1/relu-sticker-wow.png" alt="ReLU sticker wow" style="width: 120px;" />
  <div style="text-align: center;">
    <span style="font-weight: bold; display: block;">Thanks for reading all the way here!</span>
    <span>Oh, and, Happy Birthday, Project AIRI!</span>
  </div>
</div>
<div style="display: flex; flex-direction: column; background-color:rgba(55, 55, 55, 0.08); padding: 1rem; margin-bottom: 1rem;">
  <div style="font-weight: 600; font-size: 1.2rem; margin-bottom: 0.5rem;">
    New Release!
  </div>
  <div>
    While you are reading this, we are preparing for the next release of Project AIRI, v0.6.0. Stay tuned!
  </div>
</div>
]]>
      </content:encoded>
      <pubDate>Mon, 16 Jun 2025 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Announcing xsAI 0.2 "over the reality"</title>
      <link>https://blog.moeru.ai/xsai-0.2/</link>
      <guid isPermaLink="false">https://blog.moeru.ai/xsai-0.2/</guid>
      <content:encoded>
        <![CDATA[<p>I'm pleased to announce the release of xsAI v0.2.</p>
<p>This version's codename is, once again, a song by Kizuna AI, and you can listen to it:</p>
<iframe width="100%" height="405" src="https://www.youtube.com/embed/OIdlW0u3ZXc" title="YouTube video player" frameborder="0" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
<p><s>btw, v0.1 is <a href="https://www.youtube.com/watch?v=FrcR9qvjwmo">&quot;Hello World&quot;</a></s></p>
<p>OK, so here are the new features:</p>
<ul>
<li><a href="https://blog.moeru.ai/xsai-0.2/#generate-image">Generate image</a></li>
<li><a href="https://blog.moeru.ai/xsai-0.2/#reasoning-utils">Reasoning utils</a></li>
<li><a href="https://blog.moeru.ai/xsai-0.2/#more-schema-library-supported">More schema libraries supported</a></li>
<li><a href="https://blog.moeru.ai/xsai-0.2/#more-providers">More providers</a></li>
<li><a href="https://blog.moeru.ai/xsai-0.2/#more-integrations">More integrations</a></li>
</ul>
<h2 id="generate-image" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.2/#generate-image" class="header-anchor">Generate Image</a></h2>
<p>GPT-4o Image Generation is very popular these days, isn't it?</p>
<p>Now you can also use it via API and <a href="https://xsai.js.org/docs/packages/generate/image"><code>@xsai/generate-image</code></a>:</p>
<pre><code class="language-ts">import { generateImage } from '@xsai/generate-image'
import { env } from 'node:process'

const prompt = 'A children\'s book drawing of a veterinarian using a stethoscope to listen to the heartbeat of a baby otter.'

const { image } = await generateImage({
  apiKey: env.OPENAI_API_KEY!,
  baseURL: 'https://api.openai.com/v1/',
  model: 'gpt-image-1',
  prompt,
})

const { images } = await generateImage({
  apiKey: env.OPENAI_API_KEY!,
  baseURL: 'https://api.openai.com/v1/',
  n: 4,
  model: 'gpt-image-1',
  prompt,
})
</code></pre>
<p>If this feature is popular, we may introduce <code>editImage</code> later.</p>
<h2 id="reasoning-utils" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.2/#reasoning-utils" class="header-anchor">Reasoning Utils</a></h2>
<p>We made <a href="https://xsai.js.org/docs/packages/utils/reasoning"><code>@xsai/utils-reasoning</code></a> for models like <code>qwq</code> and <code>deepseek-r1</code>:</p>
<pre><code class="language-ts">import { generateText } from '@xsai/generate-text'
import { streamText } from '@xsai/stream-text'
import { extractReasoning, extractReasoningStream } from '@xsai/utils-reasoning'

const messages = [
  {
    content: 'You\'re a helpful assistant.',
    role: 'system'
  },
  {
    content: 'Why is the sky blue?',
    role: 'user'
  },
]

const { text: rawText } = await generateText({
  baseURL: 'http://localhost:11434/v1/',
  messages,
  model: 'deepseek-r1',
})

const { textStream: rawTextStream } = await streamText({
  baseURL: 'http://localhost:11434/v1/',
  messages,
  model: 'deepseek-r1',
})

// { reasoning: string | undefined, text: string }
const { reasoning, text } = extractReasoning(rawText!)
// { reasoningStream: ReadableStream&lt;string&gt;, textStream: ReadableStream&lt;string&gt; }
const { reasoningStream, textStream } = extractReasoningStream(rawTextStream)
</code></pre>
<h2 id="more-schema-library-supported" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.2/#more-schema-library-supported" class="header-anchor">More schema libraries supported</a></h2>
<p>We added support for <a href="https://v4.zod.dev/">Zod 4 Beta</a> before its official release (including <a href="https://v4.zod.dev/packages/mini"><code>@zod/mini</code></a>!).</p>
<p><a href="https://effect.website/docs/schema/introduction/">Effect Schema</a> is supported as well.</p>
<h2 id="more-providers" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.2/#more-providers" class="header-anchor">More providers</a></h2>
<p>We now support <a href="https://featherless.ai/">Featherless</a>.</p>
<p>Did you know we've added a lot of providers? View them <a href="https://github.com/moeru-ai/xsai/tree/main/packages-ext/providers-cloud/src/providers">here</a>.</p>
<h3 id="special-providers" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.2/#special-providers" class="header-anchor">Special providers</a></h3>
<h4 id="new-%F0%9F%A4%97-transformer.js-provider" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.2/#new-%F0%9F%A4%97-transformer.js-provider" class="header-anchor">New 🤗 Transformers.js provider</a></h4>
<p>Have you dreamed of a future where you can use <code>xsAI</code> completely offline, without setting up Ollama or another inference server? We mentioned this briefly in the roadmap section of our previous blog post, and here it is.</p>
<p>We now have a new dedicated project on GitHub called <a href="https://github.com/moeru-ai/xsai-transformers"><code>xsai-transformers</code></a>. It wraps <a href="https://huggingface.co/docs/transformers.js/en/index"><code>Transformers.js</code></a>, the famous library for working with models and inference, to help you get started with running embedding, speech, transcription, and chat completion models through an API designed to be compatible with xsAI, in the browser and in other WASM- or WebGPU-capable environments.</p>
<p>If you are interested, <a href="https://xsai-transformers.netlify.app/">try it on our live demo</a>.</p>
<pre><code class="language-bash">npm i xsai-transformers
</code></pre>
<p>Using it looks like this:</p>
<pre><code class="language-typescript">import { createEmbedProvider } from 'xsai-transformers'
import embedWorkerURL from 'xsai-transformers/embed/worker?worker&amp;url'
import { embed } from 'xsai'

const transformers = createEmbedProvider({ baseURL: `xsai-transformers:///?worker-url=${embedWorkerURL}` })

// [
//   -0.038177140057086945,
//   0.032910916954278946,
//   -0.005459371022880077,
//   // ...
// ]
const { embedding } = await embed({
  ...transformers.embed('Xenova/all-MiniLM-L6-v2'),
  input: 'sunny day at the beach'
})
</code></pre>
<h4 id="unspeech" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.2/#unspeech" class="header-anchor">unSpeech</a></h4>
<p>You may have noticed that <a href="https://github.com/moeru-ai/xsai/pull/136">we removed unSpeech from <code>@xsai-ext/providers-local</code></a> (unSpeech is our own provider for connecting to speech synthesis services through an OpenAI-style API). This doesn't mean we have given up on unSpeech. On the contrary, over the past month we added support for <a href="https://www.alibabacloud.com/en/product/modelstudio">Alibaba Cloud Model Studio</a> and <a href="https://www.volcengine.com/product/voice-tech">Volcano Engine</a> to unSpeech.</p>
<p>Therefore, it's time for <a href="https://www.npmjs.com/package/unspeech">unSpeech to get its own package</a>. You can still use all the previously provided features by installing <a href="https://www.npmjs.com/package/unspeech"><code>unspeech</code></a>:</p>
<pre><code class="language-bash">npm i unspeech
</code></pre>
<h2 id="more-integrations" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.2/#more-integrations" class="header-anchor">More Integrations</a></h2>
<p>Did you know? We now have official <a href="https://agentic.so/sdks/xsai">Agentic</a> and <a href="https://voltagent.dev/docs/providers/xsai/">VoltAgent</a> integrations.</p>
<h2 id="join-our-community" tabindex="-1"><a href="https://blog.moeru.ai/xsai-0.2/#join-our-community" class="header-anchor">Join our Community</a></h2>
<p>If you have questions about anything related to xsAI,</p>
<p>you're always welcome to ask our community on <a href="https://github.com/moeru-ai/xsai/discussions">GitHub Discussions</a>.</p>
]]>
      </content:encoded>
      <pubDate>Thu, 01 May 2025 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Introducing xsAI, a &lt; 6KB Vercel AI SDK alternative</title>
      <link>https://blog.moeru.ai/introducing-xsai/</link>
      <guid isPermaLink="false">https://blog.moeru.ai/introducing-xsai/</guid>
      <content:encoded>
        <![CDATA[<h2 id="why-another-ai-sdk%3F" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#why-another-ai-sdk%3F" class="header-anchor">Why another AI SDK?</a></h2>
<p><a href="https://sdk.vercel.ai/">Vercel AI SDK</a> is way too big; it includes unnecessary dependencies.</p>
<p><a href="https://pkg-size.dev/ai@4.1.47"><img src="https://blog.moeru.ai/images/pkg-size-ai.png" alt="pkg-size-ai"></a></p>
<p>For example, Vercel AI SDK ships with non-optional
<a href="https://opentelemetry.io/">OpenTelemetry</a> dependencies, binds the user to <a href="https://zod.dev/">zod</a> (you don't
get to choose), and so much more...</p>
<p>This makes it hard to build small, decent AI applications &amp; CLI tools with a smaller bundle size and the more controllable,
atomic capabilities that users truly need.</p>
<p>But it doesn't need to be like this, does it?</p>
<h3 id="so-how-small-is-xsai%3F" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#so-how-small-is-xsai%3F" class="header-anchor">So how small is xsAI?</a></h3>
<p>Without further ado, let's look:</p>
<p><a href="https://pkg-size.dev/xsai@0.1.0-beta.9"><img src="https://blog.moeru.ai/images/pkg-size-xsai.png" alt="pkg-size-xsai"></a></p>
<p>It's roughly a hundred times smaller than the Vercel AI SDK (*install size) and has most of its features.</p>
<p>Also it is 5.7KB gzipped, so the title is not wrong.</p>
<p><a href="https://pkg-size.dev/xsai@0.1.0-beta.9"><img src="https://blog.moeru.ai/images/pkg-size-xsai-bundle.png" alt="pkg-size-xsai-bundle"></a></p>
<h2 id="getting-started" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#getting-started" class="header-anchor">Getting started</a></h2>
<p>You can install the <code>xsai</code> package, which contains all the core utils.</p>
<pre><code class="language-bash">npm i xsai
</code></pre>
<p>Or install the corresponding packages separately according to the required
features:</p>
<pre><code class="language-bash">npm i @xsai/generate-text @xsai/embed @xsai/model
</code></pre>
<h3 id="generating-text" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#generating-text" class="header-anchor">Generating Text</a></h3>
<p>So let's start with some simple examples.</p>
<pre><code class="language-ts">import { generateText } from '@xsai/generate-text'
import { env } from 'node:process'

const { text } = await generateText({
  apiKey: env.OPENAI_API_KEY!,
  baseURL: 'https://api.openai.com/v1/',
  model: 'gpt-4o',
  messages: [{
    role: 'user',
    content: 'Why is the sky blue?',
  }],
})
</code></pre>
<p>By default, xsAI does not use provider functions <a href="https://sdk.vercel.ai/docs/foundations/providers-and-models">like Vercel does</a>;
we simplified them into three shared fields: <code>apiKey</code>, <code>baseURL</code> and <code>model</code>.</p>
<ul>
<li><code>apiKey</code>: Provider API Key</li>
<li><code>baseURL</code>: Provider Base URL (will be merged with the path of the corresponding util, e.g. <code>new URL('chat/completions', 'https://api.openai.com/v1/')</code>)</li>
<li><code>model</code>: Name of the model to use</li>
</ul>
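<p>To make the <code>baseURL</code> merging concrete, here is a minimal sketch that uses only the standard
<code>URL</code> constructor (no xsAI code involved). It shows why the trailing slash in the base URL matters: without it,
the last path segment is replaced instead of extended. The variable names are just for illustration.</p>
<pre><code class="language-ts">// With a trailing slash, the relative path is appended under /v1/.
const withSlash = new URL('chat/completions', 'https://api.openai.com/v1/')

// Without it, the last segment ("v1") is dropped during resolution.
const withoutSlash = new URL('chat/completions', 'https://api.openai.com/v1')

console.log(withSlash.href) // https://api.openai.com/v1/chat/completions
console.log(withoutSlash.href) // https://api.openai.com/chat/completions
</code></pre>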
<blockquote>
<p>Don't worry if you need to support a non-OpenAI-compatible API provider, such as <a href="https://claude.ai/">Claude</a>: we left open the possibility to override
<a href="https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API/Using_Fetch"><code>fetch(...)</code></a>, so you can customize how the request is made
and how the response is handled.</p>
</blockquote>
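<p>What such an override could look like is sketched below. This is only an assumption-laden illustration: <code>withExtraHeader</code>
is a hypothetical helper (not part of xsAI), and the header name and value are made up. It wraps any fetch-compatible function and
adds one header to each request; how the resulting function is handed to xsAI is best checked in the documentation.</p>
<pre><code class="language-ts">// Hypothetical helper (not part of xsAI): wraps a fetch-compatible function
// and adds one extra header to every outgoing request.
function withExtraHeader(inner: typeof fetch, name: string, value: string) {
  return function (input: string | URL | Request, init?: RequestInit) {
    // Copy any headers the caller already set, then add ours.
    const headers = new Headers(init?.headers)
    headers.set(name, value)
    // Delegate to the wrapped fetch; the response is passed through untouched.
    return inner(input, { ...init, headers })
  }
}

// Assumed header name and value, for illustration only.
const customFetch = withExtraHeader(fetch, 'x-example-client', 'my-app')
</code></pre>
<p>The same wrapping idea could rewrite the request body or translate the response, which is the kind of customization
a non-OpenAI-compatible provider would need.</p>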
<p>This allows xsAI to support any OpenAI-compatible API without having to create provider packages.</p>
<h3 id="generating-text-w%2F-tool-calling" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#generating-text-w%2F-tool-calling" class="header-anchor">Generating Text w/ Tool Calling</a></h3>
<p>Continuing with the example above, we now add the tools.</p>
<pre><code class="language-ts">import { generateText } from '@xsai/generate-text'
import { tool } from '@xsai/tool'
import { env } from 'node:process'
import * as z from 'zod'

const weather = await tool({
  name: 'weather',
  description: 'Get the weather in a location',
  parameters: z.object({
    location: z.string().describe('The location to get the weather for'),
  }),
  execute: async ({ location }) =&gt; ({
    location,
    temperature: 72 + Math.floor(Math.random() * 21) - 10,
  }),
})

const { text } = await generateText({
  apiKey: env.OPENAI_API_KEY!,
  baseURL: 'https://api.openai.com/v1/',
  model: 'gpt-4o',
  messages: [{
    role: 'user',
    content: 'What is the weather in San Francisco?',
  }],
  tools: [weather],
})
</code></pre>
<p>Wait, <a href="https://zod.dev/"><code>zod</code></a> is not good for tree shaking, which is annoying. Can we use <a href="https://valibot.dev/"><code>valibot</code></a>? <strong>Of course!</strong></p>
<pre><code class="language-ts">import { tool } from '@xsai/tool'
import { description, object, pipe, string } from 'valibot'

const weather = await tool({
  name: 'weather',
  description: 'Get the weather in a location',
  parameters: object({
    location: pipe(
      string(),
      description('The location to get the weather for'),
    ),
  }),
  execute: async ({ location }) =&gt; ({
    location,
    temperature: 72 + Math.floor(Math.random() * 21) - 10,
  }),
})
</code></pre>
<p>We can even use <a href="https://arktype.io/"><code>arktype</code></a>, and the list of compatibility will grow in the future:</p>
<pre><code class="language-ts">import { tool } from '@xsai/tool'
import { type } from 'arktype'

const weather = await tool({
  name: 'weather',
  description: 'Get the weather in a location',
  parameters: type({
    location: 'string',
  }),
  execute: async ({ location }) =&gt; ({
    location,
    temperature: 72 + Math.floor(Math.random() * 21) - 10,
  }),
})
</code></pre>
<blockquote>
<p>xsAI doesn't limit your choices to <a href="https://zod.dev/"><code>zod</code></a>, <a href="https://valibot.dev/"><code>valibot</code></a>, or <a href="https://arktype.io/"><code>arktype</code></a>: with
the power of <a href="https://github.com/standard-schema/standard-schema">Standard Schema</a>, you can use any schema library you like, as long as it is supported.</p>
</blockquote>
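<p>For the curious, here is a rough sketch of the shared shape that Standard Schema asks libraries to expose: a <code>~standard</code>
property with a <code>validate</code> function. The schema below is written by hand purely for illustration (<code>locationSchema</code>,
<code>parseLocation</code> and the vendor name are made up), and it only demonstrates validation. It is not meant to be passed to
<code>tool</code>, which presumably needs more than validation from a schema library.</p>
<pre><code class="language-ts">// A hand-written schema object that follows the Standard Schema shape:
// everything lives under the '~standard' key.
const locationSchema = {
  '~standard': {
    version: 1,
    vendor: 'example', // hypothetical vendor name
    validate(value: unknown) {
      if (typeof value === 'string') {
        return { value }
      }
      return { issues: [{ message: 'Expected a string' }] }
    },
  },
}

// A consumer only needs the shared validate() entry point.
// This sketch assumes a synchronous validate(); the spec also allows a Promise.
function parseLocation(input: unknown): string {
  const result = locationSchema['~standard'].validate(input)
  if (result.issues !== undefined) {
    throw new Error(result.issues[0].message)
  }
  return result.value
}

console.log(parseLocation('San Francisco'))
</code></pre>
<p>Because every supported library exposes this same entry point, a consumer can stay agnostic about which one you picked.</p>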
<h3 id="easy-migration" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#easy-migration" class="header-anchor">Easy migration</a></h3>
<p>Are you already using the Vercel AI SDK? Let's see how to migrate to xsAI:</p>
<pre><code class="language-diff">- import { openai } from '@ai-sdk/openai'
- import { generateText, tool } from 'ai'
+ import { generateText, tool } from 'xsai'
+ import { env } from 'node:process'
import * as z from 'zod'

const { text } = await generateText({
+ apiKey: env.OPENAI_API_KEY!,
+ baseURL: 'https://api.openai.com/v1/',
- model: openai('gpt-4o'),
+ model: 'gpt-4o',
  messages: [{
    role: 'user',
    content: 'What is the weather in San Francisco?',
  }],
- tools: {
+ tools: [
-   weather: tool({
+   await tool({
+     name: 'weather',
      description: 'Get the weather in a location',
      parameters: z.object({
        location: z.string().describe('The location to get the weather for'),
      }),
      execute: async ({ location }) =&gt; ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    })
- },
+ ],
})
</code></pre>
<p>That's it!</p>
<h2 id="next-steps" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#next-steps" class="header-anchor">Next steps</a></h2>
<h3 id="big-fan-of-anthropic's-mcp%3F" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#big-fan-of-anthropic's-mcp%3F" class="header-anchor">Big fan of <a href="https://www.anthropic.com/news/model-context-protocol">Anthropic's MCP</a>?</a></h3>
<p>We are working on <a href="https://modelcontextprotocol.io/introduction">Model Context Protocol</a> support: <a href="https://github.com/moeru-ai/xsai/pull/84">#84</a></p>
<h3 id="don't-like-any-of-the-cloud-provider%3F" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#don't-like-any-of-the-cloud-provider%3F" class="header-anchor">Don't like any of the cloud provider?</a></h3>
<p>We are working on a <a href="https://huggingface.co/docs/transformers.js/index">🤗 Transformers.js</a> provider that lets you run LLMs and any other
model supported by 🤗 Transformers.js directly in the browser, with the power of WebGPU!</p>
<p>You can track the progress here: <a href="https://github.com/moeru-ai/xsai/issues/41">#41</a>. It is really cool and fun to run embedding, speech,
and transcription models directly in the browser, so stay tuned!</p>
<h3 id="need-framework-bindings%3F" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#need-framework-bindings%3F" class="header-anchor">Need framework bindings?</a></h3>
<p>We will do this in v0.2. See you next time!</p>
<h2 id="documentation" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#documentation" class="header-anchor">Documentation</a></h2>
<p>Since this is just an introductory article, it only covers <code>generate-text</code> and <code>tool</code>.</p>
<p><code>xsai</code> <a href="https://github.com/moeru-ai/xsai/blob/main/packages/xsai/src/index.ts">has more utils:</a></p>
<pre><code class="language-ts">export * from '@xsai/embed'
export * from '@xsai/generate-object'
export * from '@xsai/generate-speech'
export * from '@xsai/generate-text'
export * from '@xsai/generate-transcription'
export * from '@xsai/model'
export * from '@xsai/shared-chat'
export * from '@xsai/stream-object'
export * from '@xsai/stream-text'
export * from '@xsai/tool'
export * from '@xsai/utils-chat'
export * from '@xsai/utils-stream'
</code></pre>
<p>If you are interested, go to the documentation at <a href="https://xsai.js.org/docs">https://xsai.js.org/docs</a> to get started!</p>
<p>Besides xsAI, we made loads of other cool stuff too! Check out our <a href="https://github.com/moeru-ai"><code>moeru-ai</code> GitHub organization</a>!</p>
<h2 id="join-our-community" tabindex="-1"><a href="https://blog.moeru.ai/introducing-xsai/#join-our-community" class="header-anchor">Join our Community</a></h2>
<p>If you have questions about anything related to xsAI, you're always welcome to ask our community on <a href="https://github.com/moeru-ai/xsai/discussions">GitHub Discussions</a>.</p>
]]>
      </content:encoded>
      <pubDate>Mon, 03 Mar 2025 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Hello, World!</title>
      <link>https://blog.moeru.ai/hello-world/</link>
      <guid isPermaLink="false">https://blog.moeru.ai/hello-world/</guid>
      <content:encoded><![CDATA[<p>Welcome to the Moeru AI Blog.</p>
]]></content:encoded>
      <pubDate>Tue, 25 Feb 2025 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>