<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>Martyna Mul, Author at Inero Software - Software Consulting</title>
	<atom:link href="https://inero-software.com/author/martynamul/feed/" rel="self" type="application/rss+xml" />
	<link>https://inero-software.com/author/martynamul/</link>
	<description>We unleash innovations using cutting-edge technologies, modern design and AI</description>
	<lastBuildDate>Fri, 16 May 2025 09:27:59 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.1</generator>

<image>
	<url>https://inero-software.com/wp-content/uploads/2018/11/inero-logo-favicon.png</url>
	<title>Martyna Mul, Author at Inero Software - Software Consulting</title>
	<link>https://inero-software.com/author/martynamul/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">153509928</site>	<item>
		<title>LLM Implementation and Maintenance Costs for Businesses: A Detailed Breakdown</title>
		<link>https://inero-software.com/llm-implementation-and-maintenance-costs-for-businesses-a-detailed-breakdown/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Wed, 14 May 2025 06:44:35 +0000</pubDate>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[BusinessProcessesOptimization]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[cost]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7981</guid>

					<description><![CDATA[<p>In this post we discuss the types of costs associated with using dedicated LLMs and present example calculations for popular models (such as GPT-4, Claude, Mistral, LLaMA, etc.), including business use case scenarios.</p>
<p>The article <a href="https://inero-software.com/llm-implementation-and-maintenance-costs-for-businesses-a-detailed-breakdown/">LLM Implementation and Maintenance Costs for Businesses: A Detailed Breakdown</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7981" class="elementor elementor-7981" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-b624393 e-flex e-con-boxed e-con e-parent" data-id="b624393" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-93f3c2f elementor-widget elementor-widget-html" data-id="93f3c2f" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-3d9c5ec elementor-widget elementor-widget-text-editor" data-id="3d9c5ec" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4>When considering the introduction of artificial intelligence into your company, it’s important to understand the costs involved in implementing and maintaining your own LLM. Expenses go beyond just paying for model usage (e.g., token-based API fees) and include a range of factors — from infrastructure to security. Below, we discuss the types of costs associated with using dedicated LLMs and present example calculations for popular models (such as GPT-4, Claude, Mistral, LLaMA, etc.), including business use case scenarios.</h4>						</div>
				</div>
				<div class="elementor-element elementor-element-085701f elementor-widget elementor-widget-text-editor" data-id="085701f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>More and more companies are considering the use of large language models (LLMs) in their own products and processes. These “dedicated” models can act as intelligent assistants that answer customer questions, analyze documents, generate reports, and much more. <a href="https://inero-software.com/chatbot-agent-or-ai-assistant-find-out-which-solution-is-best-for-your-business/">Read more about choosing between a chatbot, an agent, and an AI assistant.</a></p>						</div>
				</div>
				<div class="elementor-element elementor-element-4636eb2 elementor-widget elementor-widget-heading" data-id="4636eb2" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Types of Costs When Using LLMs</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-dc7b85d elementor-widget elementor-widget-text-editor" data-id="dc7b85d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Before starting the implementation, it&#8217;s important to understand all the components that contribute to the total cost of using a dedicated model.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d01d87f elementor-widget elementor-widget-heading" data-id="d01d87f" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Infrastructure:
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-556fadf elementor-widget elementor-widget-text-editor" data-id="556fadf" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>If you&#8217;re using models via a cloud API (OpenAI, Anthropic, Google), </strong>you only pay for the tokens used. The infrastructure cost is &#8220;hidden&#8221; on the provider&#8217;s side.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-fca6d2f elementor-widget elementor-widget-text-editor" data-id="fca6d2f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>If you choose to self-host a model such as Mistral or LLaMA, </strong>you’ll need to maintain a GPU server—either locally or in the cloud. For example, renting an instance with an A100 GPU typically costs $1–2 per hour, which amounts to $750–1,500 per month if the server runs continuously. While such an investment can handle a high volume of queries, it may be underutilized at a smaller scale.</p>						</div>
				</div>
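				<p>The monthly figure follows directly from the hourly rate. The short sketch below makes the arithmetic explicit; the 30-day month and the <code>utilization</code> parameter are assumptions added for illustration, and with them the always-on range comes out at $720 to $1,440, which the text rounds to $750–1,500.</p>

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: a 30-day billing month


def monthly_rental_cost(hourly_rate_usd, utilization=1.0):
    """Monthly cost in USD of a GPU instance billed by the hour.

    utilization is the fraction of the month the instance is running
    (1.0 means it is never shut down). It is an illustrative parameter,
    not something a specific cloud provider exposes.
    """
    return hourly_rate_usd * HOURS_PER_DAY * DAYS_PER_MONTH * utilization


low = monthly_rental_cost(1.0)   # A100 rented at $1 per hour
high = monthly_rental_cost(2.0)  # A100 rented at $2 per hour
print(f"Always on: ${low:,.0f} to ${high:,.0f} per month")

# Hypothetical case: the instance only runs during a 10-hour working day.
part_time = monthly_rental_cost(1.0, utilization=10 / 24)
print(f"10 hours a day at $1 per hour: ${part_time:,.0f} per month")
```

				<p>Shutting the instance down outside working hours lowers the bill, but only if the assistant does not need to be available around the clock.</p>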
				<div class="elementor-element elementor-element-6ef6f58 elementor-widget elementor-widget-heading" data-id="6ef6f58" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Licensing and Model Fees
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-275e876 elementor-widget elementor-widget-text-editor" data-id="275e876" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Commercial models come with licensing or subscription fees. For example, when using the GPT-4 API from OpenAI or Claude from Anthropic, <strong>you pay per token used</strong> according to the provider&#8217;s pricing (we outline token costs in detail later on). Open-source models like LLaMA or Mistral, on the other hand, are available for free: <strong>there are no licensing or token fees</strong>. Meta, for instance, released LLaMA 2 under a license that permits commercial use at no charge. However, “free” doesn’t mean zero cost: you’ll still pay for the infrastructure and electricity needed to run the model (as mentioned earlier). It’s also important to check license restrictions, because some open models come with specific usage conditions (e.g., restrictions on certain industries).</p>						</div>
				</div>
				<div class="elementor-element elementor-element-aa18bfc elementor-widget elementor-widget-heading" data-id="aa18bfc" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Model Adaptation and Customization
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-96aa203 elementor-widget elementor-widget-text-editor" data-id="96aa203" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>For an LLM to perform well in a specific company setting, it often requires customization—such as additional training (fine-tuning) on company-specific data or at least the preparation of tailored prompts (known as prompt engineering). This adaptation process can generate significant costs:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-8573d17 elementor-widget elementor-widget-text-editor" data-id="8573d17" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>Model Fine-Tuning:</strong> Training a model on your own dataset requires computing power (typically GPUs running for many hours) and expert knowledge. For larger models, this can cost anywhere from several thousand to tens of thousands of dollars—factoring in both infrastructure expenses and specialist time. Even fine-tuning a smaller model (e.g., GPT-3.5) via OpenAI’s API can incur significant costs, as it involves processing hundreds of thousands or even millions of tokens during training—billed according to the provider’s token pricing.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-092f2e3 elementor-widget elementor-widget-text-editor" data-id="092f2e3" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>Prompt Engineering:</strong> As an alternative or complement to training, you can craft tailored prompts and instructions for the model. While writing prompts itself doesn’t require paid resources, iteratively testing and refining multiple versions consumes tokens (which adds cost when using a cloud-based model) and takes up team time. This can be viewed as either an operational cost or a competence-related expense—specialist time is needed to optimize the model’s behavior for your specific use case.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-b4d3407 elementor-widget elementor-widget-heading" data-id="b4d3407" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Operational Costs
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-d96252c elementor-widget elementor-widget-text-editor" data-id="d96252c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>After deploying the model, ongoing operational costs come into play. These include monitoring the model’s performance, maintaining efficiency, logging results, applying updates, and fixing potential issues. If you&#8217;re using an API, the main operational cost <strong>will be the monthly bill for consumed tokens</strong>, along with any premium subscription fees (some providers offer subscription plans with usage limits or preferred pricing). If the model is hosted locally, operational costs typically include:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-15a5e0f elementor-widget elementor-widget-text-editor" data-id="15a5e0f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>Electricity consumption</strong> – GPU-based models can consume significant amounts of power, leading to substantial monthly energy costs.</p></li><li><p><strong>System administration</strong> – Time spent by administrators on server maintenance, backups, and updating software components (e.g., AI libraries).</p></li><li><p><strong>Infrastructure scaling</strong> – As demand grows, additional machines or cloud instances may be needed, resulting in further expenses.</p></li><li><p><strong>High availability</strong> – If the LLM assistant needs to operate 24/7 without downtime, you may need to invest in redundant resources (e.g., backup servers) or enter into an SLA agreement with your cloud provider.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-62dc195 elementor-widget elementor-widget-heading" data-id="62dc195" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Team Expertise
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-3d2c4a9 elementor-widget elementor-widget-text-editor" data-id="3d2c4a9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Implementing an LLM requires the right expertise within the IT/Data team. If your company lacks AI experience, it may be necessary to train existing employees or hire new specialists—such as an ML engineer or MLOps expert—which adds recruitment or training costs. Alternatively, some companies choose to work with external consultants or service providers to deploy the model. This also incurs costs, usually one-time project fees, which can be significant. It&#8217;s also important to account for the time your team spends integrating the model with existing systems (e.g., connecting it to a database or user-facing application). This is a labor cost that’s often overlooked in smaller projects but can have a major impact in practice.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-980dd92 elementor-widget elementor-widget-text-editor" data-id="980dd92" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The categories above show that the total cost of owning a dedicated LLM-based solution goes far beyond just the fee for accessing the model. It&#8217;s important to consider all these factors before making a decision. In the next section, we’ll look at specific numbers: how much a single prompt costs for various popular models, and what it would take to maintain a simple LLM assistant in two example business scenarios.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-aa5ede7 elementor-widget elementor-widget-spacer" data-id="aa5ede7" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-0acc8bb elementor-widget elementor-widget-heading" data-id="0acc8bb" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Cost of a Single Prompt in Popular LLM Models
</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-37ada92 elementor-widget elementor-widget-text-editor" data-id="37ada92" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Language models are typically billed based on the number of tokens processed. A token is a small piece of text that may represent a single word or part of a word (as a rule of thumb, 1,000 tokens correspond to roughly 750 words of continuous English text). API providers list prices per 1,000 or per 1 million tokens.</p><p>Below is a comparison of the approximate cost to process 1,000 tokens using selected popular LLMs:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-94811ff elementor-widget elementor-widget-html" data-id="94811ff" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>LLM Model Comparison</title>
  <link href="https://fonts.googleapis.com/css2?family=Roboto:wght@300&display=swap" rel="stylesheet">
  <style>
    body {
      font-family: 'Roboto', sans-serif;
      font-weight: 300;
      font-size: 14px;
      color: #1C244B;
    }
    table {
      width: 100%;
      border-collapse: collapse;
    }
    th, td {
      border: 1px solid #ccc;
      padding: 8px;
      vertical-align: top;
    }
    th {
      background-color: #f2f2f2;
    }
    td ul {
      margin: 0;
      padding-left: 18px;
    }
  </style>
</head>
<body>

<table>
  <thead>
    <tr>
      <th>LLM Model</th>
      <th>Access / License</th>
      <th>Cost per 1,000 tokens (USD)</th>
      <th>Notes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>GPT-3.5 Turbo (OpenAI)</td>
      <td>Cloud API (chat model available, e.g., in ChatGPT)</td>
      <td>$0.0015 (input)<br>$0.0020 (output)</td>
      <td>
        <ul>
          <li>Very low cost; context window of up to 16k tokens</li>
          <li>Good response quality</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>GPT-4 (8k)</td>
      <td>Cloud API (OpenAI)</td>
      <td>$0.03 (input)<br>$0.06 (output)</td>
      <td>High quality; high cost</td>
    </tr>
    <tr>
      <td>GPT-4 Turbo (128k)</td>
      <td>Cloud API (OpenAI)</td>
      <td>$0.01 (input)<br>$0.03 (output)</td>
      <td>
        <ul>
          <li>Reliable large context (up to 128k tokens)</li>
          <li>Cheaper than GPT-4 (8k), but still several times the price of GPT-3.5</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Claude Instant v1.2</td>
      <td>Cloud API (Anthropic)</td>
      <td>$0.0008 (input)<br>$0.0024 (output)</td>
      <td>
        <ul>
          <li>Fast, lower-cost Claude model (equivalent to GPT-3.5)</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Claude 2 (100k)</td>
      <td>Cloud API (Anthropic)</td>
      <td>$0.008 (input)<br>$0.024 (output)</td>
      <td>
        <ul>
          <li>High-quality model by Anthropic; context up to 100k tokens</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Mistral 7B</td>
      <td>Open source (free model)</td>
      <td>Token cost: $0</td>
      <td>
        <ul>
          <li>Requires self-hosting</li>
          <li>Alternative to GPT-3.5 – low hardware requirements (can run on a single 24GB GPU)</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>LLaMA 2 13B</td>
      <td>Open source (free model)</td>
      <td>Token cost: $0</td>
      <td>
        <ul>
          <li>Self-hosting required</li>
          <li>Needs stronger hardware (e.g., 2× 24GB GPU) than 7B, but still accessible for many companies</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>LLaMA 2 70B</td>
      <td>Open source (free model)</td>
      <td>Token cost: $0</td>
      <td>
        <ul>
          <li>Requires self-hosting</li>
          <li>Requires expensive infrastructure (e.g., 8× 80GB GPUs)</li>
          <li>At this scale, costs may match or even exceed GPT-4</li>
        </ul>
      </td>
    </tr>
  </tbody>
</table>

</body>
</html>
		</div>
				</div>
				<div class="elementor-element elementor-element-6267324 elementor-widget elementor-widget-text-editor" data-id="6267324" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Legend: How Token Costs Are Calculated</strong></p><ul><li style="list-style-type: none;"><ul><li><p><strong>Input tokens</strong> – the tokens contained in the prompt sent to the model.</p></li><li><p><strong>Output tokens</strong> – the tokens generated by the model in response (completion).</p></li></ul></li></ul><p>Most commercial providers charge separately for input and output tokens. For example:</p><p><strong>GPT-4 Turbo:</strong></p><ul><li style="list-style-type: none;"><ul><li><p>1,000 input tokens: <strong>$0.01</strong></p></li><li><p>1,000 output tokens: <strong>$0.03</strong></p></li></ul></li></ul><p>If a dialogue contains a total of 1,000 tokens (e.g., 500 input + 500 output), the cost is approximately <strong>$0.02</strong>.</p><p>A full interaction of 1,000 input tokens and 1,000 output tokens (2,000 tokens in total) costs about <strong>$0.04</strong>.</p><p><strong>By comparison:</strong></p><ul><li style="list-style-type: none;"><ul><li><p><strong>GPT-3.5 Turbo</strong> – the same interaction of 1,000 input and 1,000 output tokens costs only about <strong>$0.0035</strong> (i.e., 0.35 cents).</p></li><li><p><strong>Open-source models</strong> (e.g., Mistral, LLaMA) – token costs are <strong>$0</strong>, since you host the models yourself. You only pay for infrastructure-related costs (power consumption, server uptime, etc.).</p></li></ul></li></ul>						</div>
				</div>
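				<p>The per-interaction arithmetic can be sketched in a few lines of Python. The prices are copied from the comparison table above; the dictionary layout and the function name <code>interaction_cost</code> are illustrative choices, not part of any provider&#8217;s SDK, and real prices change over time.</p>

```python
# Prices in USD per 1,000 tokens, copied from the comparison table above.
PRICES_PER_1K = {
    "GPT-3.5 Turbo": {"input": 0.0015, "output": 0.0020},
    "GPT-4 Turbo": {"input": 0.01, "output": 0.03},
    "Claude Instant": {"input": 0.0008, "output": 0.0024},
    "Claude 2": {"input": 0.008, "output": 0.024},
}


def interaction_cost(model, input_tokens, output_tokens):
    """Cost in USD of one prompt/response pair for the given model."""
    price = PRICES_PER_1K[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1000  # prices are quoted per 1,000 tokens


# One interaction with 1,000 prompt tokens and 1,000 response tokens.
for model in PRICES_PER_1K:
    cost = interaction_cost(model, input_tokens=1000, output_tokens=1000)
    print(f"{model}: ${cost:.4f}")
```

				<p>Because output tokens are priced higher than input tokens for every model listed, an assistant that writes long answers to short questions costs more per interaction than one that summarizes long documents into a few sentences.</p>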
				<div class="elementor-element elementor-element-2c3b4b9 elementor-widget elementor-widget-text-editor" data-id="2c3b4b9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Open-source models (such as Mistral, LLaMA, etc.) are attractive because they come with no fees for the model itself—you can generate any number of tokens without paying the model provider a cent. However, to run these models, you need to maintain your own infrastructure. At a small scale, the cost of renting a machine for a single query may actually exceed the cost of an individual API call to a model like GPT. On the other hand, at a large scale—with many queries per day—open-source solutions can become significantly more cost-effective. In summary, cost-effectiveness depends on the use case, which we’ll explore in the next section.</p>						</div>
				</div>
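				<p>A rough way to see where the balance tips is to compute the monthly token volume at which API spend equals a fixed hosting bill. The sketch below is only an estimate under loud assumptions: a hypothetical flat $300 per month for a self-hosted server, an even split between input and output tokens, 2,000 tokens per interaction, a 30-day month, and no allowance for staff time or for the server&#8217;s actual throughput limits.</p>

```python
def break_even_tokens(fixed_monthly_usd, input_price_per_1k, output_price_per_1k,
                      output_share=0.5):
    """Monthly token volume at which API spend equals a fixed hosting bill.

    output_share is the fraction of all tokens that are output tokens.
    """
    blended_per_1k = ((1 - output_share) * input_price_per_1k
                      + output_share * output_price_per_1k)
    return fixed_monthly_usd / blended_per_1k * 1000


SERVER_COST = 300             # hypothetical flat monthly cost of a self-hosted GPU server
TOKENS_PER_INTERACTION = 2000  # assumption: 1,000 in + 1,000 out
DAYS_PER_MONTH = 30            # assumption

# API prices per 1,000 tokens (input, output) from the comparison table.
for model, (p_in, p_out) in {
    "GPT-3.5 Turbo": (0.0015, 0.0020),
    "GPT-4 Turbo": (0.01, 0.03),
}.items():
    tokens = break_even_tokens(SERVER_COST, p_in, p_out)
    per_day = tokens / TOKENS_PER_INTERACTION / DAYS_PER_MONTH
    print(f"{model}: break-even at {tokens:,.0f} tokens/month "
          f"(about {per_day:,.0f} interactions/day)")
```

				<p>Under these assumptions the server only pays for itself against a low-priced API at thousands of interactions per day, while against a premium API the threshold is in the hundreds.</p>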
				<div class="elementor-element elementor-element-68c5cf5 elementor-widget elementor-widget-spacer" data-id="68c5cf5" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-eb32f74 elementor-widget elementor-widget-heading" data-id="eb32f74" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Example Costs of Implementing an LLM Assistant (100 Queries per Day)
</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-d65244a elementor-widget elementor-widget-text-editor" data-id="d65244a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Let’s now consider a practical scenario: your company wants to implement a simple LLM-based virtual assistant that performs one of the following tasks:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-54a353d elementor-widget elementor-widget-text-editor" data-id="54a353d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>Document analysis</strong> – e.g., the assistant reads offers or contracts and extracts key information such as clauses, deadlines, and amounts.</p></li><li><p><strong>Customer inquiry handling</strong> – e.g., the assistant replies to customer emails with questions about pricing, product availability, technical support, etc.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-e25102c elementor-widget elementor-widget-text-editor" data-id="e25102c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Let’s assume that:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-e1312ca elementor-widget elementor-widget-text-editor" data-id="e1312ca" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p>The assistant will handle approximately <strong>100 interactions per day</strong>.</p></li><li><p>Each interaction consists of a <strong>prompt and a response</strong>, totaling around <strong>2,000 tokens</strong> (e.g., 1,000 tokens in the prompt, roughly 750 words or several paragraphs, and 1,000 tokens in the response, or about 750 generated words). This size covers fairly complex queries and detailed replies.</p></li><li><p>On a monthly basis, the assistant will process around <strong>6 million tokens</strong> (100 interactions × 30 days = 3,000 interactions; 3,000 × 2,000 tokens = 6,000,000 tokens, split evenly between input and output).</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-fd1201f elementor-widget elementor-widget-text-editor" data-id="fd1201f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>We want to compare the <strong>monthly operating costs</strong> of such an assistant depending on the choice of model and deployment approach. We&#8217;ll present two variants:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-405f91b elementor-widget elementor-widget-text-editor" data-id="405f91b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>API Variant (Closed Model):</strong> We use a commercial model via an API (e.g., OpenAI GPT or Anthropic Claude). We don’t maintain our own servers—costs are limited to token usage, billed according to the provider’s pricing.</p></li><li><p><strong>Self-Hosted Variant (Open-Source Model):</strong> We use an open-source model (e.g., Mistral or LLaMA) deployed on our own servers. Costs include infrastructure needed to support approximately 100 queries per day—such as cloud GPU instance rental or hardware amortization, plus electricity.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-0c96b1a elementor-widget elementor-widget-text-editor" data-id="0c96b1a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Below is a table comparing <strong>estimated monthly costs</strong> for several example models under both deployment variants, assuming <strong>6 million tokens per month</strong>:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-7d37b9a elementor-widget elementor-widget-html" data-id="7d37b9a" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Monthly LLM Cost Comparison</title>
  <link href="https://fonts.googleapis.com/css2?family=Roboto:wght@300&display=swap" rel="stylesheet">
  <style>
    body {
      font-family: 'Roboto', sans-serif;
      font-weight: 300;
      font-size: 14px;
      color: #1C244B;
    }
    table {
      width: 100%;
      border-collapse: collapse;
      margin-top: 20px;
    }
    th, td {
      border: 1px solid #ccc;
      padding: 8px;
      vertical-align: top;
    }
    th {
      background-color: #f2f2f2;
    }
    td ul {
      margin: 0;
      padding-left: 18px;
    }
  </style>
</head>
<body>

<table>
  <thead>
    <tr>
      <th>Model (variant)</th>
      <th>Estimated Monthly Cost</th>
      <th>Comment</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>GPT-3.5 Turbo (API)</td>
      <td>approx. $10.50 (USD)</td>
      <td>
        <ul>
          <li>Very low cost for this quality level.</li>
          <li>Estimate: 3M input tokens × $0.0015/1k = $4.50, plus 3M output tokens × $0.0020/1k = $6.00 → ~$10.50/month total.</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>GPT-4 (8k) (API)</td>
      <td>approx. $270</td>
      <td>
        <ul>
          <li>Much higher cost for better quality.</li>
          <li>Estimate: 6M tokens → 3M × $0.03/1k (input) + 3M × $0.06/1k (output) = $270 monthly.</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>GPT-4 Turbo (128k) (API)</td>
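      <td>approx. $120</td>
      <td>
        <ul>
          <li>Well below GPT-4 (8k) thanks to cheaper input/output token pricing, but still several times more expensive than GPT-3.5.</li>
          <li>Estimate: 3M × $0.01/1k (input) + 3M × $0.03/1k (output) = $120 monthly.</li>
          <li>May even deliver better quality than GPT-4 (8k).</li>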
        </ul>
      </td>
    </tr>
    <tr>
      <td>Claude Instant (API)</td>
      <td>approx. $10</td>
      <td>
        <ul>
          <li>Comparable to GPT-3.5 in cost.</li>
          <li>Estimate: 3M × $0.0008/1k (input) + 3M × $0.0024/1k (output) = ~$9.60 for 6M tokens (plus potential flat fees).</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Claude 2 (API)</td>
      <td>approx. $100</td>
      <td>
        <ul>
          <li>Cheaper than GPT-4, but still several times more expensive than GPT-3.5.</li>
          <li>Estimate: 3M × $0.008/1k (input) + 3M × $0.024/1k (output) = ~$96 for 6M tokens.</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Mistral 7B (open source, self-hosted, 1x GPU)</td>
      <td>approx. $300</td>
      <td>
        <ul>
          <li>Cost mainly for maintaining server/GPU.</li>
          <li>Assumption: 1x 24GB GPU instance – model generates ~30–60 tokens/sec, power usage 100–150W.</li>
          <li>Actual cost depends on location and usage (electricity + server = ~$300–400/month).</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>LLaMA 2 70B (open source, self-hosted, multi-GPU)</td>
      <td>approx. $1,000+</td>
      <td>
        <ul>
          <li>High cost due to powerful GPU requirements.</li>
          <li>Typically requires at least 8×80GB GPUs (~$10k–12k hardware + high power consumption).</li>
          <li>Costs vary based on setup model (on-prem / cloud / GPU provider).</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Local model (e.g., LLaMA 13B, GPTQ, Mistral 7B – CPU)</td>
      <td>approx. $300–500</td>
      <td>
        <ul>
          <li>Cost covers operating a local server.</li>
          <li>May be slower than GPT-3.5, but offers more privacy and control.</li>
          <li>For a CPU instance (e.g., 12 cores, 64 GB RAM), the monthly cost is mainly electricity and maintenance.</li>
        </ul>
      </td>
    </tr>
  </tbody>
</table>

</body>
</html>
		</div>
				</div>
				<div class="elementor-element elementor-element-c433e92 elementor-widget elementor-widget-text-editor" data-id="c433e92" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>From the above comparison, several key takeaways can be drawn:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-cdd2a41 elementor-widget elementor-widget-text-editor" data-id="cdd2a41" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Small-scale usage (100 queries/day) favors API solutions</strong></p><p>With relatively low query volume, using a commercial API (OpenAI, Anthropic) is highly cost-effective—especially with lower-priced models like GPT-3.5 or Claude Instant, where monthly costs can be as low as a few dozen dollars. For higher-end models, monthly costs may rise to several hundred dollars. Still, at this scale, running your own GPU server at $300+ per month would be less economical than relying on cloud-based APIs.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-e8cf4e9 elementor-widget elementor-widget-text-editor" data-id="e8cf4e9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Large-scale usage (thousands of queries per day) changes the equation</strong></p><p>If your assistant becomes successful and the number of queries increases by 10x or even 100x, the monthly API bill could grow to thousands or even tens of thousands of dollars. In such cases, investing in an open-source, self-hosted model starts to make financial sense. With a high enough query volume, the <strong>per-request cost</strong> of running the model locally drops below the API cost, because the largely fixed cost of purchased or rented hardware is spread across more requests. In extreme cases of massive scale, some organizations may even consider training their own model from scratch, but this is typically reserved for the largest players with very substantial budgets.</p>						</div>
				</div>
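				<p>The break-even point between an API and a self-hosted model can be sketched with a short calculation. This is a minimal illustration rather than a pricing model: it assumes the API bill scales linearly with query volume, that the self-hosted server has a flat monthly cost and enough capacity for the load, and it reuses the rough figures from the comparison table (100 queries/day taken as about 3,000 queries/month, $18 or $270 per month for the API, $300 per month for a single-GPU server). The function name <code>break_even_queries</code> is hypothetical.</p>

```python
import math


def break_even_queries(self_hosted_monthly_usd, api_monthly_usd, baseline_queries):
    """Return the monthly query count at which a flat-cost self-hosted
    server stops being more expensive than a linearly priced API.

    api_monthly_usd is the API bill observed at baseline_queries per month.
    """
    if not (api_monthly_usd > 0 and baseline_queries > 0):
        raise ValueError("API cost and baseline query count must be positive")
    # API cost per query is api_monthly_usd / baseline_queries, so the API
    # bill equals the self-hosted cost at self_hosted * baseline / api queries.
    return math.ceil(self_hosted_monthly_usd * baseline_queries / api_monthly_usd)


# Assumed figures taken from the comparison table (illustrative only).
BASELINE = 100 * 30  # 100 queries/day, about 3,000 per month
print(break_even_queries(300, 18, BASELINE))   # GPT-3.5-class API vs $300 server
print(break_even_queries(300, 270, BASELINE))  # GPT-4-class API vs $300 server
```

				<p>Under these assumptions, the low-cost API stays cheaper until roughly 50,000 queries per month, while the higher-priced API is overtaken at about 3,334 queries per month. The sketch ignores quality differences between the models as well as staffing and maintenance costs.</p>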
				<div class="elementor-element elementor-element-8d36cb0 elementor-widget elementor-widget-text-editor" data-id="8d36cb0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Use Case Matters (Quality vs. Cost Efficiency)</strong></p><p>Choosing the right model shouldn&#8217;t be based solely on cost—it also depends on the quality of output required for your use case. In a <strong>document analysis</strong> scenario, precision in extracting information is the top priority. A lower-cost or open-source model may be sufficient here, especially if fine-tuned to the task. A model with 7B–13B parameters can offer adequate performance at a much lower cost. Moreover, when processing <strong>sensitive documents</strong> (e.g., contracts), running the model locally ensures that the content never leaves your organization—an invaluable benefit from a legal and data privacy standpoint. On the other hand, in <strong>customer inquiry handling</strong>, where natural language quality, politeness, and contextual understanding are critical, <strong>GPT-4</strong> can significantly outperform smaller models. In this case, a company may find it worthwhile to pay more for superior customer experience.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-e71a8c1 elementor-widget elementor-widget-text-editor" data-id="e71a8c1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Hidden Costs Around the Project</strong></p><p>It&#8217;s important to note that the above calculations cover only the <strong>technical costs</strong>—such as token usage or infrastructure. In practice, there are also <strong>&#8220;soft&#8221; costs</strong> to consider, including staff time for preparing the implementation, integrating the model with systems like a CRM or knowledge base, testing, and ongoing iterations and improvements. For example, if the assistant needs to retrieve data from a company&#8217;s internal document repository, those documents often need to be <strong>organized or cleaned</strong> before they can be effectively used by the model.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-a572344 elementor-widget elementor-widget-spacer" data-id="a572344" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-2a1f46d elementor-widget elementor-widget-heading" data-id="2a1f46d" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Cost Example: AI Assistant for Analyzing Emails and PDF Documents
</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-f3e96de elementor-widget elementor-widget-text-editor" data-id="f3e96de" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>As a concrete example, here is the cost breakdown of our assistant based on Google&#8217;s Gemini model, which we described in <a href="https://inero-software.com/meet-your-personal-ai-agent-a-case-study-for-a-freight-forwarding-company/">this case study</a>. Its task is to automatically analyze incoming emails to identify insurance policies and extract key data from attached PDF documents, such as policy number, insured party address, or payment confirmation.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-149557e elementor-widget elementor-widget-text-editor" data-id="149557e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Average Token Count per Email:</strong></p><ul><li style="list-style-type: none;"><ul><li><p><strong>Input:</strong> 3,500 tokens</p></li><li><p><strong>Output:</strong> 220 tokens</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-6ac8e71 elementor-widget elementor-widget-text-editor" data-id="6ac8e71" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Analyzing 100 emails with attachments using the <strong>Gemini 2.0 Flash</strong> model costs approximately <strong>$1.50</strong>.</p>						</div>
				</div>
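				<p>An estimate like the one above can be derived by multiplying token counts by per-token prices. The sketch below shows that arithmetic for the stated averages (3,500 input and 220 output tokens per email). The prices passed in are placeholders, not Gemini rates, so the printed number is not expected to match the $1.50 figure; substitute the current rates from the provider price list. The function name <code>batch_cost_usd</code> is hypothetical.</p>

```python
def batch_cost_usd(emails, input_tokens_per_email, output_tokens_per_email,
                   input_price_per_million, output_price_per_million):
    """Estimate the API cost of processing a batch of emails.

    Prices are in USD per one million tokens.
    """
    input_tokens = emails * input_tokens_per_email
    output_tokens = emails * output_tokens_per_email
    input_cost = input_tokens * input_price_per_million / 1_000_000
    output_cost = output_tokens * output_price_per_million / 1_000_000
    return input_cost + output_cost


# Token averages come from the article; the prices are placeholders.
cost = batch_cost_usd(
    emails=100,
    input_tokens_per_email=3500,
    output_tokens_per_email=220,
    input_price_per_million=1.00,   # placeholder, not a real rate
    output_price_per_million=2.00,  # placeholder, not a real rate
)
print(f"Estimated cost for 100 emails: ${cost:.2f}")
```

				<p>In this workload, 100 emails amount to 350,000 input tokens against 22,000 output tokens, so the input price tends to dominate the total unless output tokens are priced many times higher.</p>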
				<div class="elementor-element elementor-element-6721885 elementor-widget elementor-widget-heading" data-id="6721885" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Summary</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-2655d3c elementor-widget elementor-widget-text-editor" data-id="2655d3c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Can We Afford Our Own “ChatGPT” in the Company? </strong>As we&#8217;ve seen, the answer is: <strong>it depends</strong>, primarily on the scale of usage and quality requirements. The key lies in selecting a model and deployment method that aligns with your specific needs. An <strong>iterative approach</strong> is often the most practical: start with a lower-cost model or API, evaluate the results, and scale up to a more powerful model or self-hosted solution as the project matures. Regardless of the path you choose, <strong>careful planning and cost monitoring</strong> across all categories are essential. We hope this comparison helps you make informed decisions and prepare a realistic budget for implementing a dedicated LLM in your organization.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-ec198b5 elementor-widget elementor-widget-text-editor" data-id="ec198b5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>If you&#8217;re considering implementing an assistant in your company, it&#8217;s worth finding answers to the following questions:</strong></p>						</div>
				</div>
				<div class="elementor-element elementor-element-22bdc83 elementor-widget elementor-widget-text-editor" data-id="22bdc83" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p>Do I need high-quality responses (e.g., GPT-4), or is an approximate answer sufficient (e.g., Claude Haiku, Gemini Flash)?</p></li><li><p>Am I processing sensitive data (e.g., customer documents)?</p></li><li><p>Do I have an IT team capable of hosting a model in-house?</p></li><li><p>What is the expected number of queries per day/month?</p></li><li><p>Is it more cost-effective to maintain my own infrastructure, or should I pay for API access?</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-f145f07 elementor-widget elementor-widget-text-editor" data-id="f145f07" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>For small to medium-scale applications, the cost of using a dedicated LLM can be quite reasonable. Thanks to cloud-based services, it’s possible to get started for just a few dozen dollars per month with models like GPT-3.5 or Claude Instant—an excellent option for experimentation and early prototypes. If you need top-tier performance, such as what GPT-4 offers, you&#8217;ll need to account for higher costs. However, even a few hundred dollars per month can be justified if the business value is significant—for example, by automating tasks that would otherwise require many hours of manual work.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-b80a60d elementor-widget elementor-widget-text-editor" data-id="b80a60d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>On the other hand, for large companies planning intensive AI use, costs can grow rapidly, since API bills scale with every additional query. That makes it worth considering open-source options and greater investment in in-house infrastructure. Open models like LLaMA or Mistral offer freedom from per-token fees, but shift the cost burden to hardware and staffing. They become cost-effective when operating at scale or when <strong>full control over data</strong> is a top priority.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-65aa533 elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="65aa533" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<a class="elementor-cta" href="https://inero-software.com/contact-us/">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/02/cta-AI2-1030x579.png);" role="img" aria-label="cta AI2"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Looking to Bring AI Tools into Your Company?					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						We offer comprehensive technology support in the field of artificial intelligence and AI agents.
Tell us about your idea!
					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<span class="elementor-cta__button elementor-button elementor-size-">
						Contact Us					</span>
					</div>
							</div>
						</a>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>The article <a href="https://inero-software.com/llm-implementation-and-maintenance-costs-for-businesses-a-detailed-breakdown/">LLM Implementation and Maintenance Costs for Businesses: A Detailed Breakdown</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7981</post-id>	</item>
		<item>
		<title>AI User Privacy: An Analysis of Platform Policies</title>
		<link>https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Wed, 30 Apr 2025 08:35:35 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[Gemini]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Privacy Policies]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7890</guid>

					<description><![CDATA[<p>In this article, we’ll break down the data privacy policies of top AI platforms. You will also learn what to do to ensure your data is not used for training Large Language Models (LLMs).</p>
<p>The article <a href="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/">AI User Privacy: An Analysis of Platform Policies</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7890" class="elementor elementor-7890" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-bc35505 e-flex e-con-boxed e-con e-parent" data-id="bc35505" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-44ba7b6 elementor-widget elementor-widget-html" data-id="44ba7b6" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-23d3d65 elementor-widget elementor-widget-text-editor" data-id="23d3d65" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4>Ever wondered where your data goes when you interact with AI cloud platforms, or whether it is used to train future models? In this article, we’ll break down the data privacy policies of top AI platforms. You will also learn what to do to ensure your data is not used for training Large Language Models (LLMs).</h4>						</div>
				</div>
				<div class="elementor-element elementor-element-18af7ef elementor-widget elementor-widget-text-editor" data-id="18af7ef" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Major AI cloud providers have become increasingly transparent about their data usage policies &#8211; especially when it comes to training models. While most platforms, particularly those offering enterprise-level services, do not use your inputs and outputs for training by default, the fine print matters. Understanding how these services handle your data &#8211; and how you can maintain control &#8211; is essential.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-e8dd97a elementor-widget elementor-widget-text-editor" data-id="e8dd97a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>In this article, we’ll break down the data privacy and model training policies of top AI platforms, including OpenAI, Google Gemini, Microsoft’s Azure OpenAI and Anthropic’s Claude. You’ll learn:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-33a2f78 elementor-widget elementor-widget-text-editor" data-id="33a2f78" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li>How AI platforms use your data and whether your data is used to train models by default</li><li>How to opt out of having your data used for training, if needed</li><li>Where your data is stored (data residency), and</li><li>What compliance measures (like GDPR) apply</li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-291cb3e elementor-widget elementor-widget-text-editor" data-id="291cb3e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Adopting AI isn’t just about prompt engineering or model performance. It’s also about knowing where your data goes—and how to ensure it stays under your control.</p><p><strong>Here’s what you need to know:</strong></p>						</div>
				</div>
				<div class="elementor-element elementor-element-ecb1c2a elementor-widget elementor-widget-spacer" data-id="ecb1c2a" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-2add4a6 elementor-widget elementor-widget-heading" data-id="2add4a6" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">OpenAI – Data Usage and Privacy</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-cb6fb79 elementor-widget elementor-widget-text-editor" data-id="cb6fb79" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>OpenAI treats your data differently based on how you interact with its services:</p><p><strong>ChatGPT App (Web/Mobile)</strong></p><p>When you chat with ChatGPT, your conversations may be used to train AI models &#8211; unless you manually opt out. To prevent your data from being used:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-01b1f1c elementor-widget elementor-widget-text-editor" data-id="01b1f1c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li>Go to Settings → Data Controls → Improve the model for everyone and toggle it off.</li><li>Even with the opt-out, OpenAI stores chats for 30 days for abuse monitoring before deletion.</li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-1fb6e1a elementor-widget elementor-widget-image" data-id="1fb6e1a" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img fetchpriority="high" decoding="async" data-attachment-id="7897" data-permalink="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/image-2-2/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png" data-orig-size="602,407" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image (2)" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/image-2-1-300x203.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png" tabindex="0" role="button" width="602" height="407" src="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png" class="attachment-large size-large wp-image-7897" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png 602w, https://inero-software.com/wp-content/uploads/2025/04/image-2-1-300x203.png 300w, https://inero-software.com/wp-content/uploads/2025/04/image-2-1-444x300.png 444w" sizes="(max-width: 602px) 100vw, 602px" />													</div>
				</div>
				<div class="elementor-element elementor-element-422566b elementor-widget elementor-widget-heading" data-id="422566b" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">OpenAI API and ChatGPT Enterprise</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-0be90ca elementor-widget elementor-widget-text-editor" data-id="0be90ca" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>If you&#8217;re a developer or a business using <strong>OpenAI&#8217;s API</strong> or <strong>ChatGPT Enterprise</strong>, there’s no need to opt out: by default, <strong>OpenAI does not use API or Enterprise data to train its models</strong>, and <strong>your data stays private</strong>. You can still choose to share data to help improve the model, but only if you explicitly opt in.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d497044 elementor-widget elementor-widget-heading" data-id="d497044" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Data Residency  </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-bd96e19 elementor-widget elementor-widget-text-editor" data-id="bd96e19" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>OpenAI’s servers are mostly based in the <strong>United States</strong>, and currently, if you&#8217;re using the API directly, <strong>you can’t choose where your data is stored</strong>. That means your data is processed within OpenAI’s own infrastructure &#8211; protected by strong security, but <strong>not necessarily hosted in your country</strong>.</p><p>However, there’s some progress for enterprise users. OpenAI recently introduced an option for <strong>eligible enterprise API customers</strong> that allows data to be stored in <strong>Europe</strong>, provided there’s a specific agreement in place.</p><p><strong>If regional data residency</strong> is important for your business &#8211; say, for GDPR or internal compliance &#8211; you might want to consider using <strong>Azure OpenAI</strong>, which hosts OpenAI’s models on Microsoft’s cloud. With Azure, you can choose a region like <strong>Western Europe or Asia</strong>, and all data processing and storage will stay within that geography.</p><p>We’ll dive into Azure more in the next section &#8211; but in short: <strong>OpenAI handles your data securely</strong>, but for strict control over <i>where</i> it lives, <strong>a partner cloud service like Azure may be a better fit.</strong></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5c3e1ec elementor-widget elementor-widget-spacer" data-id="5c3e1ec" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-0a6231f elementor-widget elementor-widget-heading" data-id="0a6231f" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Google (Gemini) – Google’s Approach to Your Data </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-c7c8d68 elementor-widget elementor-widget-text-editor" data-id="c7c8d68" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Google’s foray into generative AI includes <strong>Gemini</strong>, a next-generation model that powers products like Google Gemini (the chatbot) and various enterprise AI offerings on Google Cloud. Here&#8217;s how Google handles your data:</p><h5><b>Gemini App</b></h5><p><strong>By default, Google saves your Gemini chat history to your account (much like search history) and may use it to improve its services. However, Google provides a “Gemini Activity” setting to control this.</strong></p><p>To manage this:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-83c9f69 elementor-widget elementor-widget-text-editor" data-id="83c9f69" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li>Visit <strong>Gemini Activity</strong> settings.</li><li>Pause Gemini Activity to stop saving chats and prevent them from being used in <strong>AI model training data sources.</strong></li><li>You can also delete existing conversation history.</li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-13a61aa elementor-widget elementor-widget-text-editor" data-id="13a61aa" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><a href="https://support.google.com/gemini/answer/13594961#your_data"><span class="TextRun Underlined SCXW259565000 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW259565000 BCX0" data-ccp-charstyle="Hyperlink">Turning off Gemini Apps Activity</span></span></a><span class="TextRun SCXW259565000 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW259565000 BCX0"> means your new chats won’t be used to improve Google’s machine learning services, nor will they be seen by human reviewers, unless you explicitly submit them as feedback. This gives regular users a way to opt out, similar to ChatGPT’s opt-out toggle.</span></span><span class="EOP SCXW259565000 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-cb94ad1 elementor-widget elementor-widget-image" data-id="cb94ad1" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img decoding="async" data-attachment-id="7901" data-permalink="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/image-1/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/image-1.png" data-orig-size="712,332" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image (1)" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/image-1-300x140.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/image-1.png" tabindex="0" role="button" width="712" height="332" src="https://inero-software.com/wp-content/uploads/2025/04/image-1.png" class="attachment-large size-large wp-image-7901" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/image-1.png 712w, https://inero-software.com/wp-content/uploads/2025/04/image-1-300x140.png 300w, https://inero-software.com/wp-content/uploads/2025/04/image-1-643x300.png 643w" sizes="(max-width: 712px) 100vw, 712px" />													</div>
				</div>
				<div class="elementor-element elementor-element-e57f550 elementor-widget elementor-widget-text-editor" data-id="e57f550" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW161006688 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW161006688 BCX0">To stop saving your conversations, go to the Activity tab and switch Gemini Apps Activity off. You can also delete your past conversations.</span></span><span class="EOP SCXW161006688 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-ca02a5a elementor-widget elementor-widget-heading" data-id="ca02a5a" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">API and Vertex AI </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-73c636b elementor-widget elementor-widget-text-editor" data-id="73c636b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW147481227 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW147481227 BCX0">If </span><span class="NormalTextRun SCXW147481227 BCX0">you’re</span><span class="NormalTextRun SCXW147481227 BCX0"> using Google Cloud’s </span></span><span class="TextRun SCXW147481227 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW147481227 BCX0">Vertex AI</span></span><span class="TextRun SCXW147481227 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW147481227 BCX0"> platform:</span></span><span class="EOP SCXW147481227 BCX0" data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5da483c elementor-widget elementor-widget-text-editor" data-id="5da483c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="5" data-aria-level="1"><span data-contrast="auto">Your prompts and outputs are </span><strong>not used to train AI models</strong><span data-contrast="auto"> without explicit permission.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="6" data-aria-level="1"><span data-contrast="auto">Data may be cached briefly (up to 24 hours) for performance but remains within your selected geographic region.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" 
data-aria-posinset="7" data-aria-level="1"><span data-contrast="auto">Businesses can opt for a </span><strong>zero-retention policy</strong><span data-contrast="auto"> for maximum privacy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-15e5fa7 elementor-widget elementor-widget-heading" data-id="15e5fa7" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Data Residency </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-6769505 elementor-widget elementor-widget-text-editor" data-id="6769505" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW242043066 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW242043066 BCX0">Data residency is a strong point for Google: you can choose which geographic region your AI service runs in (e.g. </span><span class="NormalTextRun SCXW242043066 BCX0">EU or U</span><span class="NormalTextRun SCXW242043066 BCX0">S data </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242043066 BCX0">centers</span><span class="NormalTextRun SCXW242043066 BCX0">), and Google will process and store data in that region to meet any data localization requirements.</span></span><span class="EOP SCXW242043066 BCX0" data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-dffefeb elementor-widget elementor-widget-spacer" data-id="dffefeb" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-e7c3b12 elementor-widget elementor-widget-heading" data-id="e7c3b12" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Microsoft Azure OpenAI – Enterprise Data Protection by Design </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-36c3a52 elementor-widget elementor-widget-heading" data-id="36c3a52" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Training Policy </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-657a095 elementor-widget elementor-widget-text-editor" data-id="657a095" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Microsoft’s Azure OpenAI Service hosts OpenAI’s models (GPT-4, GPT-3.5, etc.) within the Azure cloud. </span><strong>Privacy is a major selling point here.</strong><span data-contrast="auto"> Microsoft is explicit: </span><strong>data you send to Azure OpenAI is not used to train the underlying models</strong><span data-contrast="auto"> or to improve Microsoft’s or OpenAI’s services.</span></p><p><span data-contrast="none">Microsoft designed the service for enterprises that require strong privacy protections. Key aspects are:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-96820b8 elementor-widget elementor-widget-text-editor" data-id="96820b8" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="none">Any data you input into Azure OpenAI – prompts, completions (model outputs), embeddings, fine-tuning data – is not used to train the AI models. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="none">Your inputs and outputs “are NOT available to other customers, are NOT available to OpenAI, and are NOT used to improve OpenAI models”. 
</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="none">Microsoft retains data only as needed to provide the service and monitor for misuse. By default, prompts and outputs are stored for up to 30 days, solely for abuse detection, and are then deleted. If even this temporary storage is a concern (say, for ultra-sensitive data), you can apply for “modified abuse monitoring”, which bypasses the 30-day storage so that no prompts are retained at all. This exception typically requires approval, but it is an option for high-security scenarios.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-5e2b615 elementor-widget elementor-widget-heading" data-id="5e2b615" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Data Residency </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-82ccf7d elementor-widget elementor-widget-text-editor" data-id="82ccf7d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW93588553 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW93588553 BCX0">Because it’s on Azure, you can also choose your region easily, which helps with data residency requirements. When setting up Azure OpenAI, you deploy the service to an Azure region (for example, East US, West Europe, or Southeast Asia). All processing and data storage for inference will occur within that region or its geographical boundary. So, if you deploy in West Europe, your data isn’t leaving Europe &#8211; crucial for GDPR compliance. Azure itself meets numerous compliance standards (SOC 2, ISO 27001, etc.), and these extend to Azure OpenAI as an Azure service.</span></span><span class="EOP SCXW93588553 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-cea2902 elementor-widget elementor-widget-spacer" data-id="cea2902" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-0013609 elementor-widget elementor-widget-heading" data-id="0013609" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Anthropic (Claude) – A Privacy-First AI Assistant </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-6f1b8b4 elementor-widget elementor-widget-heading" data-id="6f1b8b4" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Training Policy </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-988001e elementor-widget elementor-widget-text-editor" data-id="988001e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW126360551 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW126360551 BCX0">Anthropic, the company behind the Claude AI assistant (Claude 2 and newer versions), has emphasized a privacy-conscious approach from the outset. It takes an opt-in approach to model training:</span></span><span class="EOP SCXW126360551 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-3f7e219 elementor-widget elementor-widget-text-editor" data-id="3f7e219" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="7" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="none">By default, Anthropic does not use your conversations or data to train its models. This applies to both its commercial offerings (</span><a href="https://privacy.anthropic.com/en/collections/10663361-commercial-customers"><span data-contrast="none">Claude for Work, Anthropic API</span></a><span data-contrast="none">)</span> <span data-contrast="none">and its consumer products (Claude Free, Claude Pro)</span> <span data-contrast="none">– your prompts and Claude’s responses aren’t automatically used for model training. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="7" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="none">Anthropic uses your data for training only if you deliberately opt in, such as by providing explicit feedback. For instance, if you click a thumbs-up/down in a Claude interface or send data to Anthropic’s feedback channels, you’re essentially saying “you can learn from this”.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-00d6994 elementor-widget elementor-widget-text-editor" data-id="00d6994" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW11925797 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW11925797 BCX0">For enterprise clients, Anthropic offers Claude Team/Enterprise, which not only guarantees no training on your data but also provides admin controls. One such feature is custom data retention settings. By default, Anthropic’s systems might </span><span class="NormalTextRun SCXW11925797 BCX0">retain</span><span class="NormalTextRun SCXW11925797 BCX0"> your inputs/outputs indefinitely for your account (though not for training). However, Claude Enterprise admins can set a retention policy – for example, you might set it to </span><span class="NormalTextRun SCXW11925797 BCX0">delete</span><span class="NormalTextRun SCXW11925797 BCX0"> all conversation data after 30 days, 60 days, etc., with 30 days being the current minimum. These controls aim to support compliance with regulations like GDPR.</span></span><span class="EOP SCXW11925797 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5a24ccc elementor-widget elementor-widget-heading" data-id="5a24ccc" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Data Residency  </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-3cf1dac elementor-widget elementor-widget-text-editor" data-id="3cf1dac" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW47979688 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW47979688 BCX0">Anthropic is a newer player, and currently, when you use their API directly, you </span><span class="NormalTextRun SCXW47979688 BCX0">don’t</span><span class="NormalTextRun SCXW47979688 BCX0"> explicitly choose a data region – </span><span class="NormalTextRun SCXW47979688 BCX0">it’s</span> <span class="NormalTextRun SCXW47979688 BCX0">likely hosted</span><span class="NormalTextRun SCXW47979688 BCX0"> in the US by Anthropic (or </span><span class="NormalTextRun SCXW47979688 BCX0">possibly through</span><span class="NormalTextRun SCXW47979688 BCX0"> cloud providers like AWS in the US region). However, Anthropic models are also available through partners, which can help with data residency. For example, Anthropic’s Claude is offered via Amazon Bedrock (AWS’s AI service) and via Google Cloud Vertex AI. If you use Claude through one of these platforms, you can take advantage of AWS’s or Google’s region controls.</span></span><span class="EOP SCXW47979688 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-a2a60c8 elementor-widget elementor-widget-spacer" data-id="a2a60c8" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-de688f7 elementor-widget elementor-widget-heading" data-id="de688f7" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Conclusion </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-9f1b51c elementor-widget elementor-widget-text-editor" data-id="9f1b51c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Understanding the </span><strong>data collection practices of LLM providers</strong><span data-contrast="auto"> is crucial for </span><strong>AI compliance</strong><span data-contrast="auto">, customer trust, and corporate governance. Whichever of these matters most to you, the policies above help you make informed decisions. Choose providers that align with your privacy values &#8211; and always review your settings.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">Here&#8217;s a comparison of major platforms:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-4597621 elementor-widget elementor-widget-html" data-id="4597621" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<style>
  table {
    width: 100%;
    border-collapse: collapse;
    font-family: 'Roboto', sans-serif;
    font-weight: 300;
    font-size: 14px;
    color: #1C244B;
  }
  th, td {
    border: 1px solid #ccc;
    padding: 8px;
    text-align: left;
    vertical-align: top;
  }
  th {
    background-color: #f2f2f2;
  }
  a {
    color: #1C244B;
    text-decoration: underline;
  }
</style>

<table>
  <thead>
    <tr>
      <th>Provider</th>
      <th>Trains on Your Data by Default</th>
      <th>Consumer Web App Setting</th>
      <th>Data Residency Options</th>
      <th>GDPR/CCPA Compliance</th>
      <th>Privacy Policy</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>OpenAI</td>
      <td>No (API)</td>
      <td>Opt-out available</td>
      <td>No (unless used via Microsoft Azure)</td>
      <td>Yes</td>
      <td><a href="https://openai.com/policies/privacy-policy" target="_blank">Consumer privacy</a></td>
    </tr>
    <tr>
      <td>Google</td>
      <td>No (Cloud + Gemini)</td>
      <td>Opt-out available</td>
      <td>Broad region control</td>
      <td>Yes</td>
      <td>
        <a href="https://policies.google.com/privacy" target="_blank">Google privacy policy</a>, 
        <a href="https://www.google.com/intl/en_us/gemini/privacy" target="_blank">Gemini privacy</a>, 
        <a href="https://cloud.google.com/vertex-ai/docs/general/privacy-overview" target="_blank">Vertex AI</a>
      </td>
    </tr>
    <tr>
      <td>Azure</td>
      <td>No</td>
      <td>N/A</td>
      <td>Full regional control</td>
      <td>Yes</td>
      <td><a href="https://privacy.microsoft.com/en-us/privacystatement" target="_blank">Azure OpenAI privacy</a></td>
    </tr>
    <tr>
      <td>Anthropic</td>
      <td>No</td>
      <td>No training by default</td>
      <td>No (unless used via partners)</td>
      <td>Yes</td>
      <td>
        <a href="https://www.anthropic.com/legal/privacy" target="_blank">API users</a>, 
        <a href="https://claude.ai/privacy" target="_blank">Claude.ai users</a>
      </td>
    </tr>
  </tbody>
</table>
		</div>
				</div>
				<div class="elementor-element elementor-element-5234314 elementor-widget elementor-widget-text-editor" data-id="5234314" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW227920897 BCX0">For maximum privacy and control, </span></span><span class="TextRun SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW227920897 BCX0"><b>local deployment</b></span></span><span class="TextRun SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW227920897 BCX0"><b> </b>(on-premises models) is always an alternative. This avoids cloud storage concerns entirely.</span><span class="NormalTextRun SCXW227920897 BCX0"> You can read more about local deployment </span></span><a class="Hyperlink SCXW227920897 BCX0" href="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW227920897 BCX0" data-ccp-charstyle="Hyperlink">here</span></span></a><span class="TextRun SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW227920897 BCX0">.</span></span><span class="EOP SCXW227920897 BCX0" data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-7c7244d elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="7c7244d" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<div class="elementor-cta">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/03/cta-1903-1030x579.png);" role="img" aria-label="cta 1903"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Let's talk about AI agents 					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Ready to bring AI into your business? Let us help you get started.					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<a class="elementor-cta__button elementor-button elementor-size-" href="https://inero-software.com/contact-us/">
						Contact us					</a>
					</div>
							</div>
						</div>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>The post <a href="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/">AI User Privacy: An Analysis of Platform Policies</a> first appeared on <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7890</post-id>	</item>
		<item>
		<title>Top Lightweight LLMs for Local Deployment</title>
		<link>https://inero-software.com/top-lightweight-llms-for-local-deployment/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Thu, 17 Apr 2025 09:50:46 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[Lightweight LLMs]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7843</guid>

					<description><![CDATA[<p>In this post, we’ll explore several top open-source lightweight LLMs and how to run them on a local Windows PC—whether CPU-only or with a limited GPU—for document processing tasks. </p>
<p>The post <a href="https://inero-software.com/top-lightweight-llms-for-local-deployment/">Top Lightweight LLMs for Local Deployment</a> first appeared on <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7843" class="elementor elementor-7843" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-cc31ada e-flex e-con-boxed e-con e-parent" data-id="cc31ada" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-2485c29 elementor-widget elementor-widget-html" data-id="2485c29" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
					</div>
				</div>
				<div class="elementor-element elementor-element-d3520b4 elementor-widget elementor-widget-text-editor" data-id="d3520b4" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h5><strong><span class="TrackedChange SCXW35608661 BCX0"><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun TrackChangeDeleteHighlight SCXW35608661 BCX0">Running large language models (LLMs) on your own hardware has become increasingly </span></span></span><span class="TrackedChange SCXW35608661 BCX0"><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun TrackChangeDeleteHighlight SCXW35608661 BCX0">feasible</span></span></span><span class="TrackedChange SCXW35608661 BCX0"><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun TrackChangeDeleteHighlight SCXW35608661 BCX0"> thanks to </span></span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">lightweight LLMs</span><span class="NormalTextRun SCXW35608661 BCX0">—models w</span><span class="NormalTextRun SCXW35608661 BCX0">ith</span> <span class="NormalTextRun SCXW35608661 BCX0">relatively small</span><span class="NormalTextRun SCXW35608661 BCX0"> parameter counts that deliver </span><span class="NormalTextRun SCXW35608661 BCX0">strong performance</span><span class="NormalTextRun SCXW35608661 BCX0"> without requiring server-grade GPUs.</span><span class="NormalTextRun SCXW35608661 BCX0"> In this post, </span><span class="NormalTextRun SCXW35608661 BCX0">we’ll</span><span class="NormalTextRun SCXW35608661 BCX0"> explore several top open-source lightweight LLMs and how to run them on a local Windows PC—whether CPU-only or with a limited GPU—for document processing tasks.</span> </span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">We also include a </span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" 
data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">benchmark comparing the models</span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0"> in terms of </span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">accuracy and inference speed</span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">, helping you choose the right model for your local environment and use case.</span></span><span class="EOP SCXW35608661 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:299,&quot;335559739&quot;:299}"> </span></strong></h5>						</div>
				</div>
				<div class="elementor-element elementor-element-10359f9 elementor-widget elementor-widget-heading" data-id="10359f9" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">What Are Lightweight LLMs (and Why Run Them Locally)? </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-621d40f elementor-widget elementor-widget-text-editor" data-id="621d40f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW177302101 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW177302101 BCX0">“Lightweight” LLMs are models typically in the range of ~1–8 billion parameters – far smaller than GPT-3 class models – often optimized to run on a single GPU or even CPU. They are usually released as open models with freely available weights. These models trade some raw power for efficiency, but recent research and clever engineering (better data, distilled training, efficient attention mechanisms, etc.) have dramatically improved their capabilities. Many can now match or beat much larger models on specific benchmarks</span><span class="NormalTextRun SCXW177302101 BCX0">.</span></span><span class="EOP SCXW177302101 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-81497fe elementor-widget elementor-widget-text-editor" data-id="81497fe" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Local deployment of such models is valuable for several reasons:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><strong>Privacy &amp; Security:</strong><span data-contrast="auto"> All data stays on your machine, which is crucial for confidential documents like insurance contracts. You’re not sending sensitive text to a third-party API.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>
<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>Cost Savings:</strong><span data-contrast="auto"> Once downloaded, local models run </span><strong>without usage fees</strong><span data-contrast="auto">: there are no API charges or cloud compute bills, only the cost of your own hardware and electricity. This can make a big difference if you process large volumes of documents regularly.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><strong>Latency &amp; Offline Access:</strong><span data-contrast="auto"> Local inference eliminates network latency. On a GPU, short responses can arrive almost immediately, and you can operate entirely offline. This is useful for on-site workflows or when internet access is restricted.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>
<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><strong>Customization:</strong><span data-contrast="auto"> With local models you have full control – you can adjust parameters, prompts, or fine-tune models to better fit your domain (e.g. insurance data) without vendor limits.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><p><span data-contrast="auto">In short, lightweight LLMs put AI capabilities directly in your hands, on hardware you own. Next, we’ll compare some of the leading open models that are well suited to local document processing.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-6e958d1 elementor-widget elementor-widget-heading" data-id="6e958d1" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Comparing Top Lightweight LLMs </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-adbf2c8 elementor-widget elementor-widget-text-editor" data-id="adbf2c8" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW101152181 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW101152181 BCX0">Lightweight open-source large language models (LLMs) are becoming a practical choice for organizations looking to run AI workloads locally. They offer a strong balance between performance, speed, and resource requirements—making them ideal for document summarization, extraction, and classification without relying on cloud infrastructure. </span></span><span class="EOP SCXW101152181 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
					</div>
				</div>
		<div class="elementor-element elementor-element-330c9fe e-flex e-con-boxed e-con e-parent" data-id="330c9fe" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-73949bc elementor-widget elementor-widget-text-editor" data-id="73949bc" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">We’ll focus on the following open-source models (each with downloadable checkpoints) that have a good reputation for quality relative to their size:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-6703794 elementor-widget elementor-widget-text-editor" data-id="6703794" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><strong>Llama 3.1</strong><span data-contrast="auto"> – 8B parameters (Meta AI)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>StableLM Zephyr</strong><span data-contrast="auto"> – 3B parameters (Stability AI)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><strong>Llama 3.2</strong><span data-contrast="auto"> – 1B/3B parameters (Meta AI)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" 
data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><strong>Mistral</strong><span data-contrast="auto"> – 7B parameters (Mistral AI)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="5" data-aria-level="1"><strong>Gemma 3</strong><span data-contrast="auto"> – 1B and 4B variants (Google DeepMind)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="6" data-aria-level="1"><strong>DeepSeek R1</strong><span data-contrast="auto"> – 1.5B and 7B variants (DeepSeek AI)</span><span 
data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="7" data-aria-level="1"><strong>Phi-4 Mini</strong><span data-contrast="auto"> – 3.8B parameters (Microsoft)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="8" data-aria-level="1"><strong>TinyLlama</strong><span data-contrast="auto"> – 1.1B parameters (community project)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-f98ca55 elementor-widget elementor-widget-text-editor" data-id="f98ca55" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><span data-contrast="auto">These models range from very small (under 1 GB on disk) to mid-sized (~5 GB). All can run inference on a 16 GB GPU, in half precision for the smaller models or in 4-bit quantized form for the 7–8B ones, and many are workable on a CPU given enough RAM and patience. Table 1 summarizes their characteristics:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-71cd074 elementor-widget elementor-widget-html" data-id="71cd074" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<style>
  @import url('https://fonts.googleapis.com/css2?family=Roboto:wght@300&display=swap');

  .model-table {
    font-family: 'Roboto', sans-serif;
    font-weight: 300;
    font-size: 14px;
    color: #1C244B;
    border-collapse: collapse;
    width: 100%;
  }

  .model-table th, .model-table td {
    border: 1px solid #ccc;
    padding: 8px;
    text-align: left;
    color: #1C244B;
  }

  .model-table th {
    background-color: #f2f2f2;
  }
</style>

<table class="model-table">
  <thead>
    <tr>
      <th>Model</th>
      <th>Size on Disk (quantized)</th>
      <th>Max Context</th>
      <th>Licence</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Llama 3.1 (8B)</td>
      <td>4.9 GB</td>
      <td>128k tokens</td>
      <td>Open weights (Llama 3.1 Community License)</td>
    </tr>
    <tr>
      <td>StableLM Zephyr (3B)</td>
      <td>1.6 GB</td>
      <td>4k tokens</td>
      <td>Non-commercial use only</td>
    </tr>
    <tr>
      <td>Llama 3.2 (3B)</td>
      <td>2.0 GB</td>
      <td>128k tokens</td>
      <td>Open weights (Llama 3.2 Community License)</td>
    </tr>
    <tr>
      <td>Mistral (7B)</td>
      <td>4.1 GB</td>
      <td>32k tokens</td>
      <td>Open-source (Apache 2.0)</td>
    </tr>
    <tr>
      <td>Gemma 3 (4B)</td>
      <td>3.3 GB</td>
      <td>128k tokens</td>
      <td>Open weights (Gemma Terms of Use)</td>
    </tr>
    <tr>
      <td>Gemma 3 (1B)</td>
      <td>0.8 GB</td>
      <td>32k tokens</td>
      <td>Open weights (Gemma Terms of Use)</td>
    </tr>
    <tr>
      <td>DeepSeek R1 (7B)</td>
      <td>4.7 GB</td>
      <td>128k tokens</td>
      <td>Open-source (MIT licence)</td>
    </tr>
    <tr>
      <td>DeepSeek R1 (1.5B)</td>
      <td>1.1 GB</td>
      <td>128k tokens</td>
      <td>Open-source (MIT licence)</td>
    </tr>
    <tr>
      <td>Phi-4 Mini (3.8B)</td>
      <td>2.5 GB</td>
      <td>128k tokens</td>
      <td>Open-source (MIT licence)</td>
    </tr>
    <tr>
      <td>TinyLlama (1.1B)</td>
      <td>0.6 GB</td>
      <td>2k tokens</td>
      <td>Open-source (Apache 2.0)</td>
    </tr>
  </tbody>
</table>
		</div>
				</div>
				<div class="elementor-element elementor-element-55c06b4 elementor-widget elementor-widget-text-editor" data-id="55c06b4" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="TextRun SCXW254867370 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW254867370 BCX0">Table 1:</span></span><span class="TextRun SCXW254867370 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW254867370 BCX0"> Lightweight LLMs for local use: quantized size on disk, maximum context window, and licence.</span></span><span class="EOP SCXW254867370 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h6>						</div>
				</div>
				<div class="elementor-element elementor-element-58e51e9 elementor-widget elementor-widget-text-editor" data-id="58e51e9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW7653520 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW7653520 BCX0">Notes:</span></span></strong><span class="TextRun SCXW7653520 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW7653520 BCX0"> “Max Context” is the maximum sequence length, in tokens, that the model can process in a single pass; the prompt and the generated output both count towards it. </span></span><span class="EOP SCXW7653520 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
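				<p>The on-disk sizes in Table 1 follow roughly from simple arithmetic: parameter count times bits per weight, divided by eight bits per byte. The Python sketch below is our own back-of-the-envelope illustration, not part of any tool. The function name <code>estimate_size_gb</code> and the figure of about 4.5 bits per weight for a 4-bit quantized file are assumptions, and real files differ somewhat because some tensors are stored at higher precision.</p>

```python
def estimate_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB: parameters x bits per weight / 8 bits per byte."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Parameter counts as listed in Table 1. The 4.5 bits/weight figure is an
# assumed average for a 4-bit format that also stores per-block scales.
for name, params in [("Llama 3.1", 8.0), ("Mistral", 7.0), ("TinyLlama", 1.1)]:
    half_precision = estimate_size_gb(params, 16)
    four_bit = estimate_size_gb(params, 4.5)
    print(f"{name}: about {half_precision:.1f} GB at 16-bit, "
          f"about {four_bit:.1f} GB at ~4.5 bits per weight")
```

				<p>At 16 bits per weight, an 8B model needs about 16 GB for the weights alone, which is why quantization matters on a 16 GB GPU.</p>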
				<div class="elementor-element elementor-element-223eda5 elementor-widget elementor-widget-text-editor" data-id="223eda5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW99345828 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW99345828 BCX0">Next, </span><span class="NormalTextRun SCXW99345828 BCX0">let’s</span><span class="NormalTextRun SCXW99345828 BCX0"> look at each model’s </span></span><span class="TextRun SCXW99345828 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW99345828 BCX0">pros and cons</span></span><span class="TextRun SCXW99345828 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW99345828 BCX0">, especially in the context of document tasks:</span></span><span class="EOP SCXW99345828 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-7192f01 elementor-widget elementor-widget-text-editor" data-id="7192f01" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><strong>Llama 3.1 (8B):</strong><span data-contrast="auto"> Powerful general-purpose model with strong multilingual capabilities at a moderate size. Heavy for CPU-only systems; documents that exceed the context window or available memory still need chunking.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>
<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>StableLM Zephyr (3B):</strong><span data-contrast="auto"> Very lightweight; good for basic QA and extraction. Limited by its small parameter count, its short 4k-token context, and a licence that restricts commercial use.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>
<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><strong>Llama 3.2 (3B):</strong><span data-contrast="auto"> Strong summarization and retrieval for its size; long context support (128k tokens). The smaller size reduces accuracy on complex reasoning.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>
<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><strong>Mistral (7B):</strong><span data-contrast="auto"> Best overall performer for its size; highly efficient inference. Well suited to detailed summarization tasks.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>
<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="5" data-aria-level="1"><strong>Gemma 3 (4B/1B):</strong><span data-contrast="auto"> Offers extensive multilingual support, and the 4B model also accepts image input. The 4B model balances capability and speed; the 1B model is text-only and best suited to simple tasks.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>
<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="6" data-aria-level="1"><strong>DeepSeek R1 (7B/1.5B):</strong><span data-contrast="auto"> Balanced efficiency and comprehension for general NLP tasks; more limited than Mistral on complex reasoning.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>
<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="7" data-aria-level="1"><strong>Phi-4 Mini (3.8B):</strong><span data-contrast="auto"> Strong reasoning, math, and logic for its size; well suited to analytical document processing. English-focused.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>
<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="8" data-aria-level="1"><strong>TinyLlama (1.1B):</strong><span data-contrast="auto"> Extremely lightweight; suitable for basic text extraction and classification tasks. Limited contextual understanding.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-906c9d8 elementor-widget elementor-widget-text-editor" data-id="906c9d8" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW259074413 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW259074413 BCX0">The models reviewed above cover a wide range of sizes and capabilities. Larger variants like Llama 3.1 and Mistral perform well on complex summarization and multilingual tasks but are less suited for CPU-only setups. Mid-sized models such as Llama 3.2 and Gemma 3 (4B) handle long inputs efficiently with reasonable performance. Smaller models, including </span><span class="NormalTextRun SpellingErrorV2Themed SCXW259074413 BCX0">TinyLlama</span><span class="NormalTextRun SCXW259074413 BCX0"> and </span><span class="NormalTextRun SpellingErrorV2Themed SCXW259074413 BCX0">StableLM</span><span class="NormalTextRun SCXW259074413 BCX0"> Zephyr, are lightweight and fast, making them practical for basic extraction or classification tasks.</span></span><span class="EOP SCXW259074413 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-013ecbc elementor-widget elementor-widget-heading" data-id="013ecbc" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Models Benchmarking: Document Extraction and Summarization </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-f583b4c elementor-widget elementor-widget-text-editor" data-id="f583b4c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW65580225 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW65580225 BCX0">Here we outline a simple </span><span class="NormalTextRun SCXW65580225 BCX0">model </span><span class="NormalTextRun SCXW65580225 BCX0">benchmarking plan covering t</span><span class="NormalTextRun SCXW65580225 BCX0">wo</span><span class="NormalTextRun SCXW65580225 BCX0"> common document-processing tasks:</span></span><span class="EOP SCXW65580225 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-236155a elementor-widget elementor-widget-text-editor" data-id="236155a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ol><li><strong>Information Extraction:</strong><span data-contrast="auto"> We evaluated how well each model extracts specific fields from a policy or certificate. Specifically, we prompted each model to find the policy number, insured name, VAT ID, address and insurance period in the document text and return them as structured output: a clean JSON response with all the required values.</span></li><li><strong>Summarization: </strong><span data-contrast="auto">Each model generated a concise summary of an insurance policy, covering key points such as coverage, exclusions, and conditions. We rated the summaries on clarity, correctness, factual accuracy and readability, and heavily penalized fabricated information.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ol>						</div>
				</div>
				<div class="elementor-element elementor-element-02421da elementor-widget elementor-widget-text-editor" data-id="02421da" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0">We used 11 documents and ran all tests using Ollama (you can read about running models with Ollama </span><a href="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/"><span class="NormalTextRun SCXW43958002 BCX0">here</span></a><span class="NormalTextRun SCXW43958002 BCX0">). The benchmarks were performed on a PC equipped with an NVIDIA GeForce RTX 2060 with 6 GB of VRAM. To ensure consistent results, each model was run with 
</span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0">temperature set to 0</span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0"> for the extraction task (to produce deterministic outputs), and with a fixed </span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0">temperature of 0.7</span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0"> for summarization. For the extraction task, we also used </span></span><a class="Hyperlink SCXW43958002 BCX0" href="https://ollama.com/blog/structured-outputs" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW43958002 BCX0" data-ccp-charstyle="Hyperlink">structured outputs</span></span></a><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0">:</span> </span><span class="EOP SCXW43958002 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335557856&quot;:16777215,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:270}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f1a279a elementor-widget elementor-widget-text-editor" data-id="f1a279a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre> <br /><br /><span data-contrast="none">{</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"model"</span><span data-contrast="none">: </span><span data-contrast="none">"deepseek-r1:7b"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"prompt"</span><span data-contrast="none">: </span><span data-contrast="none">"You are an assistant that extracts insurance-related information from a given input text. You must extract and return only the following fields: - policy_number,- insurance_period,- insured (company or person name),- nip (tax identification number),- address (of the insured). Return the output as a **clean JSON object** — not as a string, not inside quotes, and without any commentary. If a field is missing, use 'Not found'. Document text: "</span><span data-contrast="none">,</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335557856&quot;:16777215,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:270}"> </span><br /><br /><span data-contrast="none">    </span><span data-contrast="none">"stream"</span><span data-contrast="none">: </span><b><span data-contrast="none">false</span></b><span data-contrast="none">,</span> <br /><span data-contrast="none">    </span><span data-contrast="none">"format"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">    </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"object"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">    </span><span data-contrast="none">"properties"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"policy_number"</span><span data-contrast="none">: {</span> <br /><span 
data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insurance_period_start"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insurance_period_end"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured_nip"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured_address"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      }</span> <br /><span data-contrast="none">    
},</span> <br /><span data-contrast="none">    </span><span data-contrast="none">"required"</span><span data-contrast="none">: [</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"policy_number"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insurance_period_start"</span><span data-contrast="none">, </span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insurance_period_end"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured_nip"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured_address"</span> <br /><span data-contrast="none">    ]</span> <br /><span data-contrast="none">  }</span> <br /><span data-contrast="none">}</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335557856&quot;:16777215,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:270}"> </span></pre>						</div>
				</div>
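<p>As a rough illustration, a request like the one above could be assembled and sent from Python as sketched below. This is a minimal sketch under stated assumptions, not the exact script used for the benchmark: the helper names <code>build_payload</code> and <code>extract_fields</code> are hypothetical, the URL assumes Ollama's default local endpoint, and the temperature is passed through the <code>options</code> field.</p>

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

# Field names taken from the JSON schema shown above.
FIELDS = [
    "policy_number",
    "insurance_period_start",
    "insurance_period_end",
    "insured",
    "insured_nip",
    "insured_address",
]


def build_payload(model, prompt, document_text):
    """Build a request body mirroring the structured-output request above."""
    schema = {
        "type": "object",
        "properties": {name: {"type": "string"} for name in FIELDS},
        "required": list(FIELDS),
    }
    return {
        "model": model,
        "prompt": prompt + document_text,
        "stream": False,
        "format": schema,
        "options": {"temperature": 0},  # temperature 0 for the extraction task
    }


def extract_fields(payload, url=OLLAMA_URL, timeout=120):
    """POST the payload to Ollama and parse the model's JSON answer."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # With "stream": false, the generated text arrives in the "response" field.
    return json.loads(body["response"])
```

<p>Calling <code>extract_fields(build_payload("deepseek-r1:7b", prompt, text))</code> would then return a dictionary keyed by the six schema fields, provided the model is available locally.</p>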
				<div class="elementor-element elementor-element-a6fbca0 elementor-widget elementor-widget-image" data-id="a6fbca0" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img decoding="async" data-attachment-id="7846" data-permalink="https://inero-software.com/top-lightweight-llms-for-local-deployment/attachment/111553/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/111553.png" data-orig-size="1154,649" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="111553" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/111553-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/111553-1030x579.png" tabindex="0" role="button" width="1030" height="579" src="https://inero-software.com/wp-content/uploads/2025/04/111553-1030x579.png" class="attachment-large size-large wp-image-7846" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/111553-1030x579.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/111553-300x169.png 300w, https://inero-software.com/wp-content/uploads/2025/04/111553-768x432.png 768w, https://inero-software.com/wp-content/uploads/2025/04/111553-533x300.png 533w, https://inero-software.com/wp-content/uploads/2025/04/111553.png 1154w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7846" data-permalink="https://inero-software.com/top-lightweight-llms-for-local-deployment/attachment/111553/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/111553.png" data-orig-size="1154,649" data-comments-opened="0" 
data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="111553" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/111553-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/111553-1030x579.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-c923f73 elementor-widget elementor-widget-text-editor" data-id="c923f73" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="TextRun SCXW85460195 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW85460195 BCX0">Examples of insurance </span><span class="NormalTextRun SCXW85460195 BCX0">certificates</span><span class="NormalTextRun SCXW85460195 BCX0">.</span></span><span class="EOP SCXW85460195 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h6>						</div>
				</div>
				<div class="elementor-element elementor-element-e9e7e62 elementor-widget elementor-widget-text-editor" data-id="e9e7e62" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW36022441 BCX0">The table below presents the benchmark results.</span></span></strong> <span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW36022441 BCX0">Extraction accuracy</span></span><span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW36022441 BCX0"> is the number of documents (out of 11) for which the model correctly extracted all key fields. </span></span><span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW36022441 BCX0">Tokens/sec</span></span><span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"> <span class="NormalTextRun SCXW36022441 BCX0">indicates</span><span class="NormalTextRun SCXW36022441 BCX0"> the model’s inference speed, that is, how quickly it generates responses.</span></span><span class="EOP SCXW36022441 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
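<p>The two metrics can be computed along the lines sketched below. This is an illustration under assumptions rather than the exact scoring used here: it treats a field as correct only on an exact match after trimming whitespace, and it derives speed from the <code>eval_count</code> and <code>eval_duration</code> (nanoseconds) values that Ollama reports with each response. The function names are hypothetical.</p>

```python
def all_fields_correct(predicted, expected):
    """True only if every expected field matches after trimming whitespace."""
    return all(
        str(predicted.get(name, "")).strip() == str(value).strip()
        for name, value in expected.items()
    )


def extraction_accuracy(predictions, ground_truth):
    """Return (documents with all fields correct, total documents)."""
    correct = sum(
        all_fields_correct(pred, truth)
        for pred, truth in zip(predictions, ground_truth)
    )
    return correct, len(ground_truth)


def tokens_per_second(eval_count, eval_duration_ns):
    """Generation speed from a token count and a duration in nanoseconds."""
    return eval_count / (eval_duration_ns / 1e9)
```

<p>A document counts towards the score only when all of its fields match, which mirrors the all-or-nothing definition of extraction accuracy given above. A stricter or more lenient matching rule (for example, normalising date formats) would change the counts.</p>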
				<div class="elementor-element elementor-element-e5f35c8 elementor-widget elementor-widget-html" data-id="e5f35c8" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<style>
  @import url('https://fonts.googleapis.com/css2?family=Roboto:wght@300&display=swap');

  .model-table {
    font-family: 'Roboto', sans-serif;
    font-weight: 300;
    font-size: 14px;
    color: #1C244B;
    border-collapse: collapse;
    width: 100%;
  }

  .model-table th, .model-table td {
    border: 1px solid #ccc;
    padding: 8px;
    text-align: left;
    color: #1C244B;
  }

  .model-table th {
    background-color: #f2f2f2;
  }

  .green-bg {
    background-color: #DFF0D8;
  }

  .red-bg {
    background-color: #F2DEDE;
  }
</style>

<table class="model-table">
  <thead>
    <tr>
      <th>Model</th>
      <th>Summarization</th>
      <th>Extraction Accuracy</th>
      <th>Tokens/sec</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Llama 3.1 (8B)</td>
      <td class="green-bg">High-quality, no hallucinations</td>
      <td>10/11</td>
      <td>13.49</td>
    </tr>
    <tr>
      <td>StableLM Zephyr 3B</td>
      <td class="red-bg">Average quality, typos/hallucinations</td>
      <td>4/11</td>
      <td>56.51</td>
    </tr>
    <tr>
      <td>Llama 3.2 (3B)</td>
      <td class="green-bg">Concise yet comprehensive summary, no hallucinations</td>
      <td>8/11</td>
      <td>49.49</td>
    </tr>
    <tr>
      <td>Mistral 7B</td>
      <td>Extensive summary, factually correct</td>
      <td>8/11</td>
      <td>29.01</td>
    </tr>
    <tr>
      <td>Gemma 3 4B</td>
      <td class="green-bg">Concise yet comprehensive summary, no hallucinations</td>
      <td>10/11</td>
      <td>13.37</td>
    </tr>
    <tr>
      <td>Gemma 3 1B</td>
      <td class="green-bg">Concise yet comprehensive summary, no hallucinations</td>
      <td>4/11</td>
      <td>73.46</td>
    </tr>
    <tr>
      <td>DeepSeek 7B</td>
      <td class="green-bg">Concise yet comprehensive summary, no hallucinations</td>
      <td>6/11</td>
      <td>16.39</td>
    </tr>
    <tr>
      <td>DeepSeek 1.5B</td>
      <td class="red-bg">Very poor, frequent hallucinations/errors</td>
      <td>0/11</td>
      <td>66.45</td>
    </tr>
    <tr>
      <td>Phi-4 Mini 3.8B</td>
      <td>Very concise summaries, factually correct</td>
      <td>9/11</td>
      <td>39.31</td>
    </tr>
    <tr>
      <td>TinyLlama 1.1B</td>
      <td class="red-bg">Poor quality, severe hallucinations</td>
      <td>2/11</td>
      <td>107.34</td>
    </tr>
  </tbody>
</table>
		</div>
				</div>
				<div class="elementor-element elementor-element-4f30579 elementor-widget elementor-widget-text-editor" data-id="4f30579" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="TextRun SCXW220458249 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW220458249 BCX0">Table 2: </span><span class="NormalTextRun SCXW220458249 BCX0">B</span><span class="NormalTextRun SCXW220458249 BCX0">enchmarking results.</span></span><span class="TextRun SCXW220458249 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW220458249 BCX0"> </span></span><span class="EOP SCXW220458249 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h6>						</div>
				</div>
				<div class="elementor-element elementor-element-1046393 elementor-widget elementor-widget-image" data-id="1046393" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7847" data-permalink="https://inero-software.com/top-lightweight-llms-for-local-deployment/lightweight-llm-scatterplot/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot.png" data-orig-size="1968,1180" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="lightweight-llm-scatterplot" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-300x180.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1030x618.png" tabindex="0" role="button" width="1030" height="618" src="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1030x618.png" class="attachment-large size-large wp-image-7847" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1030x618.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-300x180.png 300w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-768x460.png 768w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1536x921.png 1536w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-500x300.png 500w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot.png 1968w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7847" 
data-permalink="https://inero-software.com/top-lightweight-llms-for-local-deployment/lightweight-llm-scatterplot/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot.png" data-orig-size="1968,1180" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="lightweight-llm-scatterplot" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-300x180.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1030x618.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-704e9c5 elementor-widget elementor-widget-text-editor" data-id="704e9c5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW241422309 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW241422309 BCX0">This scatterplot visualizes the </span></span><span class="TextRun SCXW241422309 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW241422309 BCX0">trade-off between extraction accuracy and inference speed</span></span><span class="TextRun SCXW241422309 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW241422309 BCX0"> (measured in tokens per second).</span></span><span class="EOP SCXW241422309 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5527166 elementor-widget elementor-widget-text-editor" data-id="5527166" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">The benchmarking results reveal significant variations among the tested models. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559685&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><strong>Bottom-right</strong><span data-contrast="auto"> models &#8211; </span><strong>Llama 3.1 (8B), Gemma 3 (4B)</strong><span data-contrast="auto">, and </span><strong>Phi-4 Mini (3.8B)</strong> <span data-contrast="auto">&#8211; </span><span data-contrast="auto">excel in summarization quality and extraction accuracy, consistently providing concise and accurate outputs. 
Phi-4 Mini seems to offer a good trade-off between speed and accuracy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>Mistral 7B, DeepSeek 7B, Llama 3.2</strong><span data-contrast="auto"> generate detailed and informative summaries, though their extraction performance is more moderate.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="auto">On the other hand, </span><strong>smaller models</strong> <span data-contrast="auto">(on the top-left side of the chart) like </span><strong><i>StableLM Zephyr (3B), Gemma 3 (1B)</i> and <i>TinyLlama</i></strong><i><span data-contrast="auto"> (1.1B)</span></i><span data-contrast="auto"> show significantly weaker extraction accuracy and are prone to frequent hallucinations. 
However, they benefit from faster inference times. Their limited context windows (e.g., 4k tokens) may contribute to these shortcomings. Overall, they may be suitable only for very basic tasks.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
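<p>One way to make the speed/accuracy trade-off explicit is to keep only the models that no other model beats on both metrics at once (a Pareto front). The sketch below applies that idea to the extraction and speed figures from Table 2. It is an illustrative analysis, not part of the original benchmark; the function name <code>pareto_front</code> is hypothetical, and summarization quality is deliberately ignored.</p>

```python
# (model, documents fully extracted out of 11, tokens/sec), from Table 2.
RESULTS = [
    ("Llama 3.1 (8B)", 10, 13.49),
    ("StableLM Zephyr 3B", 4, 56.51),
    ("Llama 3.2 (3B)", 8, 49.49),
    ("Mistral 7B", 8, 29.01),
    ("Gemma 3 4B", 10, 13.37),
    ("Gemma 3 1B", 4, 73.46),
    ("DeepSeek 7B", 6, 16.39),
    ("DeepSeek 1.5B", 0, 66.45),
    ("Phi-4 Mini 3.8B", 9, 39.31),
    ("TinyLlama 1.1B", 2, 107.34),
]


def pareto_front(results):
    """Keep models that are not dominated on accuracy and speed.

    A model is dominated when some other model is at least as good on
    both metrics and strictly better on at least one of them.
    """
    front = []
    for name, acc, speed in results:
        dominated = any(
            o_acc >= acc and o_speed >= speed and (o_acc > acc or o_speed > speed)
            for _, o_acc, o_speed in results
        )
        if not dominated:
            front.append(name)
    return front


if __name__ == "__main__":
    for model in pareto_front(RESULTS):
        print(model)
```

<p>On these figures the front contains Llama 3.1 (8B), Llama 3.2 (3B), Gemma 3 1B, Phi-4 Mini 3.8B and TinyLlama 1.1B. Gemma 3 4B drops out only because Llama 3.1 matches its accuracy at a marginally higher speed (13.49 vs 13.37 tokens/sec), a gap small enough that the two are best treated as equivalent on this test.</p>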
				<div class="elementor-element elementor-element-1ac20ae elementor-widget elementor-widget-heading" data-id="1ac20ae" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Choosing the Right Model for Your Needs </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-11e1bfc elementor-widget elementor-widget-text-editor" data-id="11e1bfc" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">When selecting a language model for document extraction or summarization, </span><span class="NormalTextRun SCXW204701935 BCX0">it’s</span><span class="NormalTextRun SCXW204701935 BCX0"> all about balancing </span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">accuracy</span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">, </span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">speed</span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">, and </span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">hardware constraints</span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">. Below is a quick breakdown to help you pick the best fit—whether you need high precision, fast inference, or something lightweight for basic tasks.</span></span><span class="EOP SCXW204701935 BCX0" data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-689718c elementor-widget elementor-widget-text-editor" data-id="689718c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><strong>High Accuracy &amp; Reasonable Speed:</strong><span data-contrast="auto"> Choose </span><strong>Phi-4 Mini (3.8B), Gemma 3 (4B)</strong><span data-contrast="auto">, or </span><strong>Llama 3.1 (8B)</strong><span data-contrast="auto"> for robust extraction and summarization accuracy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>Fast Inference &amp; Moderate Accuracy:</strong><span data-contrast="auto"> Opt for </span><strong>Llama 3.2 (3B)</strong><span data-contrast="auto"> or </span><strong>StableLM Zephyr (3B)</strong><span data-contrast="auto"> for simpler tasks on limited hardware.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> 
</span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><strong>Balanced Performance (Accuracy-Speed Tradeoff): Mistral (7B)</strong><span data-contrast="auto"> provides strong general-purpose capability suitable for detailed document summarization tasks.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><strong>Low Resource Environments (Basic Tasks):</strong><span data-contrast="auto"> Consider </span><strong>TinyLlama (1.1B)</strong><span data-contrast="auto"> for quick extraction or classification on minimal hardware if accuracy isn&#8217;t critical.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
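				<p>The breakdown above can be written down as a small lookup table, which is handy when the choice has to be made in a deployment script rather than by hand. This is only an illustrative sketch: the tier keys (<b>accuracy</b>, <b>speed</b>, <b>balanced</b>, <b>low_resource</b>) and the helper name <b>shortlist</b> are our own hypothetical labels, while the model names and parameter counts are taken from the list above.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    params_billion: float
    tier: str  # hypothetical label for the categories in the list above


# Model names and sizes as given in the breakdown above.
CANDIDATES = [
    Candidate("Phi-4 Mini", 3.8, "accuracy"),
    Candidate("Gemma 3", 4.0, "accuracy"),
    Candidate("Llama 3.1", 8.0, "accuracy"),
    Candidate("Llama 3.2", 3.0, "speed"),
    Candidate("StableLM Zephyr", 3.0, "speed"),
    Candidate("Mistral", 7.0, "balanced"),
    Candidate("TinyLlama", 1.1, "low_resource"),
]


def shortlist(tier, max_params_billion=None):
    """Return the model names in a tier, optionally capped by parameter count."""
    return [
        c.name
        for c in CANDIDATES
        if c.tier == tier
        and (max_params_billion is None or max_params_billion >= c.params_billion)
    ]


# Example: accuracy-focused models that stay at or below 4B parameters.
print(shortlist("accuracy", max_params_billion=4.0))
```

				<p>The parameter cap is a crude stand-in for hardware limits; in practice you would still benchmark the shortlisted models on your own documents.</p>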
				<div class="elementor-element elementor-element-ee4c212 elementor-widget elementor-widget-heading" data-id="ee4c212" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Conclusion </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-510ec3a elementor-widget elementor-widget-text-editor" data-id="510ec3a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW44846787 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW44846787 BCX0">Lightweight LLMs are increasingly </span><span class="NormalTextRun SCXW44846787 BCX0">viable</span><span class="NormalTextRun SCXW44846787 BCX0"> solutions for local deployment, particularly in document-intensive industries such as insurance. Models such as Phi-4 Mini, Gemma 3 (4B), and Mistral 7B provide </span><span class="NormalTextRun SCXW44846787 BCX0">strong performance</span><span class="NormalTextRun SCXW44846787 BCX0"> in summarization, extraction, and classification tasks. Carefully balancing model size, inference speed, and accuracy ensures </span><span class="NormalTextRun SCXW44846787 BCX0">optimal</span><span class="NormalTextRun SCXW44846787 BCX0"> outcomes, empowering organizations with affordable, private, and responsive AI solutions directly on owned hardware.</span></span><span class="EOP SCXW44846787 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-8874a86 elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="8874a86" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<a class="elementor-cta" href="https://inero-software.com/optimization-of-back-office-processes-with-ai-agent-implementation-a-practical-example/">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/03/cta-1903-1030x579.png);" role="img" aria-label="cta 1903"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						This might interest you					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Optimization of Back-Office Processes with AI Agent Implementation: A Practical Example					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<span class="elementor-cta__button elementor-button elementor-size-">
						Read the full text					</span>
					</div>
							</div>
						</a>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>The article <a href="https://inero-software.com/top-lightweight-llms-for-local-deployment/">Top Lightweight LLMs for Local Deployment</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7843</post-id>	</item>
		<item>
		<title>Deploying LLMs Locally: A Guide to Ollama and LM Studio</title>
		<link>https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Fri, 04 Apr 2025 08:53:42 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[CLI Tool]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[LM Studio]]></category>
		<category><![CDATA[Local deployment]]></category>
		<category><![CDATA[Ollama]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7692</guid>

					<description><![CDATA[<p>Whether you’re building a custom chatbot, agent, an AI-powered code assistant, or using AI to analyse documents offline, local deployment empowers you to experiment and innovate without relying on external services. </p>
<p>The article <a href="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/">Deploying LLMs Locally: A Guide to Ollama and LM Studio</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7692" class="elementor elementor-7692" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-139e1f8 e-flex e-con-boxed e-con e-parent" data-id="139e1f8" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-2474df1 elementor-widget elementor-widget-html" data-id="2474df1" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-29e8b23 elementor-widget elementor-widget-text-editor" data-id="29e8b23" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4><span class="TextRun SCXW12802383 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW12802383 BCX0">Local deployment of Large Language Models (LLMs) is becoming increasingly popular among developers, tech enthusiasts, and professionals in industries like insurance and transport. Unlike cloud-based APIs, local LLM deployment offers greater privacy, offline accessibility, and complete control over resource optimization and inference performance.</span></span><span class="EOP SCXW12802383 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h4>						</div>
				</div>
				<div class="elementor-element elementor-element-d13f5fb elementor-widget elementor-widget-text-editor" data-id="d13f5fb" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW230118114 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW230118114 BCX0">Running models like Llama 2 or Mistral directly on your hardware means your data stays on your machine — ideal for privacy-sensitive tasks such as processing insurance documents or working with proprietary transport data. There are no recurring API costs, and the performance depends solely on your system. Whether </span><span class="NormalTextRun SCXW230118114 BCX0">you&#8217;re</span><span class="NormalTextRun SCXW230118114 BCX0"> building a custom chatbot, </span><span class="NormalTextRun SCXW230118114 BCX0">agent, </span><span class="NormalTextRun SCXW230118114 BCX0">an AI-powered code assistant, or using AI to </span><span class="NormalTextRun SCXW230118114 BCX0">analyse</span><span class="NormalTextRun SCXW230118114 BCX0"> documents offline, local deployment empowers you to experiment and innovate without relying on external services.</span></span><span class="EOP SCXW230118114 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-313a919 elementor-widget elementor-widget-text-editor" data-id="313a919" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW97631897 BCX0">In this guide, </span><span class="NormalTextRun SCXW97631897 BCX0">we&#8217;ll</span><span class="NormalTextRun SCXW97631897 BCX0"> explore two powerful tools that make this possible: </span></span><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW97631897 BCX0"><b>Ollama</b></span></span><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW97631897 BCX0"> and </span></span><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW97631897 BCX0"><b>LM Studio</b></span></span><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW97631897 BCX0">. </span><span class="NormalTextRun SCXW97631897 BCX0">We&#8217;ll</span><span class="NormalTextRun SCXW97631897 BCX0"> walk through installation, usage, and customization, helping you pick the best </span><span class="NormalTextRun SCXW97631897 BCX0">option</span><span class="NormalTextRun SCXW97631897 BCX0"> for your goals.</span></span><span class="EOP SCXW97631897 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-66f4910 elementor-widget elementor-widget-heading" data-id="66f4910" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Getting Started with Ollama (CLI Tool) </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-55390e0 elementor-widget elementor-widget-text-editor" data-id="55390e0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW101755402 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW101755402 BCX0">Ollama</span><span class="NormalTextRun SCXW101755402 BCX0"> is a lightweight, open-source command-line tool for running LLMs locally. It acts as a model manager and runtime, making it easy to download and execute open-source models (like Llama 2, Mistral, </span><span class="NormalTextRun SpellingErrorV2Themed SCXW101755402 BCX0">CodeLlama</span><span class="NormalTextRun SCXW101755402 BCX0">, etc.) on your </span><span class="NormalTextRun SCXW101755402 BCX0">machine.</span> <span class="NormalTextRun SpellingErrorV2Themed SCXW101755402 BCX0">Ollama</span><span class="NormalTextRun SCXW101755402 BCX0"> is available for macOS, Linux, and Windows</span><span class="NormalTextRun SCXW101755402 BCX0">, and it includes a local REST API for integration into applications.</span></span><span class="EOP SCXW101755402 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f83e139 elementor-widget elementor-widget-text-editor" data-id="f83e139" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW107598507 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW107598507 BCX0">1.<b> Install </b></span><b><span class="NormalTextRun SpellingErrorV2Themed SCXW107598507 BCX0">Ollama</span><span class="NormalTextRun SCXW107598507 BCX0"> on Your System:</span></b></span><span class="TextRun SCXW107598507 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW107598507 BCX0"><b> </b>Download the installer for your platform from the official </span><span class="NormalTextRun SpellingErrorV2Themed SCXW107598507 BCX0">Ollama</span><span class="NormalTextRun SCXW107598507 BCX0"> website or use a package manager.</span></span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f34ab2b elementor-widget elementor-widget-text-editor" data-id="f34ab2b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW8978879 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8978879 BCX0">On Windows, download the </span></span><span class="TextRun SCXW8978879 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8978879 BCX0"><b>OllamaSetup.exe</b></span></span><span class="TextRun SCXW8978879 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8978879 BCX0"> from the website and run </span><span class="NormalTextRun SCXW8978879 BCX0">it.</span><span class="NormalTextRun SCXW8978879 BCX0"> On Linux, you can install </span><span class="NormalTextRun SpellingErrorV2Themed SCXW8978879 BCX0">Ollama</span><span class="NormalTextRun SCXW8978879 BCX0"> with one command:</span></span><span class="EOP SCXW8978879 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-b668357 elementor-widget elementor-widget-text-editor" data-id="b668357" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW8325834 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8325834 BCX0">curl -</span><span class="NormalTextRun SpellingErrorV2Themed SCXW8325834 BCX0">fsSL</span> </span><a class="Hyperlink SCXW8325834 BCX0" href="https://ollama.com/install.sh" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW8325834 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8325834 BCX0" data-ccp-charstyle="Hyperlink">https://ollama.com/install.sh</span></span></a><span class="TextRun SCXW8325834 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8325834 BCX0"> | </span><span class="NormalTextRun SpellingErrorV2Themed SCXW8325834 BCX0">sh</span></span><span class="EOP SCXW8325834 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-c086d50 elementor-widget elementor-widget-text-editor" data-id="c086d50" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW172550952 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW172550952 BCX0">After installation, open a terminal/command prompt and verify </span><span class="NormalTextRun SCXW172550952 BCX0">it’s</span><span class="NormalTextRun SCXW172550952 BCX0"> installed by checking the version:</span></span><span class="EOP SCXW172550952 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-a98aedb elementor-widget elementor-widget-text-editor" data-id="a98aedb" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW230245657 BCX0" lang="EN-GB" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text; white-space-collapse: preserve; font-size: 11pt; line-height: 19.7625px; font-family: Consolas, Consolas_EmbeddedFont, Consolas_MSFontService, monospace; font-variant-ligatures: none !important;" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text; background-position: 0px 100%; background-repeat: repeat-x; border-bottom: 1px solid transparent;">ollama</span><span class="NormalTextRun SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text;"> -</span><span class="NormalTextRun SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text;">-</span><span class="NormalTextRun SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text;">version</span></span><span class="EOP SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text; white-space-collapse: preserve; font-size: 11pt; line-height: 19.7625px; font-family: Consolas, Consolas_EmbeddedFont, Consolas_MSFontService, monospace;" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-fe7e512 elementor-widget elementor-widget-text-editor" data-id="fe7e512" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW228829587 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW228829587 BCX0">This should display the installed </span><span class="NormalTextRun SpellingErrorV2Themed SCXW228829587 BCX0">Ollama</span><span class="NormalTextRun SCXW228829587 BCX0"> version, confirming </span><span class="NormalTextRun SCXW228829587 BCX0">it’s</span><span class="NormalTextRun SCXW228829587 BCX0"> ready to </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW228829587 BCX0">use,</span><span class="NormalTextRun SCXW228829587 BCX0"> e.g.:</span></span><span class="EOP SCXW228829587 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-8f0123b elementor-widget elementor-widget-text-editor" data-id="8f0123b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW19868586 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW19868586 BCX0">ollama</span><span class="NormalTextRun SCXW19868586 BCX0"> version is 0.6.2</span></span><span class="EOP SCXW19868586 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-cf0e477 elementor-widget elementor-widget-text-editor" data-id="cf0e477" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW20221182 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW20221182 BCX0"><b>2. Download an LLM Model (&#8220;Pull&#8221; a Model):</b></span></span><span class="TextRun SCXW20221182 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"> <span class="NormalTextRun SpellingErrorV2Themed SCXW20221182 BCX0">Ollama</span><span class="NormalTextRun SCXW20221182 BCX0"> has a built-in model library. You can search the catalogue on the website or simply pull a known model by name. For example, to download the 7B-parameter Llama 2 chat model, run:</span></span><span class="EOP SCXW20221182 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-abab4bd elementor-widget elementor-widget-text-editor" data-id="abab4bd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW86029186 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW86029186 BCX0">ollama</span><span class="NormalTextRun SCXW86029186 BCX0"> pull llama2:7b-chat</span></span><span class="EOP SCXW86029186 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-e31da10 elementor-widget elementor-widget-text-editor" data-id="e31da10" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW158953993 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW158953993 BCX0">This command downloads the model weights to your machine (this may take a while, as models are several gigabytes in size). You only need to pull a model once; afterwards it is stored locally. To list all downloaded models, run </span><span class="NormalTextRun SCXW158953993 BCX0"><b>ollama list</b></span><span class="NormalTextRun SCXW158953993 BCX0">.</span></span><span class="EOP SCXW158953993 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
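				<p>To get a feel for how large a download to expect, the size of the raw weights can be estimated from the parameter count and the number of bits stored per weight. The sketch below is back-of-the-envelope arithmetic only: the bit widths are assumptions (this guide does not state how the pulled model is quantised), and real model files are somewhat larger because of metadata and layers kept at higher precision.</p>

```python
def estimate_weights_gb(params_billion, bits_per_weight):
    """Approximate size of the raw weights in gigabytes (1 GB = 10^9 bytes)."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# A 7B-parameter model at two assumed precisions.
print(estimate_weights_gb(7, 16))  # 16-bit weights
print(estimate_weights_gb(7, 4))   # 4-bit quantised weights
```

				<p>Under these assumptions a 7B model needs roughly 14 GB at 16 bits per weight and about 3.5 GB at 4 bits, which is why quantised builds are the usual choice for local deployment.</p>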
				<div class="elementor-element elementor-element-0fe8c48 elementor-widget elementor-widget-text-editor" data-id="0fe8c48" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW87322540 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW87322540 BCX0">3. Run the Model Locally:</span></span></strong><span class="TextRun SCXW87322540 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW87322540 BCX0"> Once downloaded, you can execute the model with the </span><span class="NormalTextRun SpellingErrorV2Themed SCXW87322540 BCX0">ollama</span><span class="NormalTextRun SCXW87322540 BCX0"> run command. This will launch an interactive session where you can enter prompts and get responses. For example:</span></span><span class="EOP SCXW87322540 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-bb58a0a elementor-widget elementor-widget-text-editor" data-id="bb58a0a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
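							<pre><span class="TextRun SCXW171041342 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW171041342 BCX0">ollama</span><span class="NormalTextRun SCXW171041342 BCX0"> run llama2:7b-chat</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW171041342 BCX0"><span class="SCXW171041342 BCX0"> </span><br class="SCXW171041342 BCX0" /></span><span class="TextRun SCXW171041342 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW171041342 BCX0">&gt;&gt;&gt; What is the capital city of Poland?</span></span><span class="EOP SCXW171041342 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>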
				</div>
				<div class="elementor-element elementor-element-c17cc6e elementor-widget elementor-widget-text-editor" data-id="c17cc6e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0">After you run the command, </span><span class="NormalTextRun SpellingErrorV2Themed SCXW99918251 BCX0">Ollama</span><span class="NormalTextRun SCXW99918251 BCX0"> loads the model and shows a </span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0">&gt;&gt;&gt;</span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0"> prompt. You can then type your questions or instructions, and the model (here Llama 2 7B chat) generates a response to each one. For instance, you might ask “What is the capital of France?” and get an answer like “Paris is the capital of France.” printed in the terminal. The first prompt may take a moment while the model loads into memory, but subsequent prompts are answered interactively. </span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0">Tip:</span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0"> You can also pass a one-off prompt directly in the command, e.g. </span><span class="NormalTextRun SpellingErrorV2Themed SCXW99918251 BCX0">ollama</span><span class="NormalTextRun SCXW99918251 BCX0"> run llama2:7b-chat </span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0">&quot;<b>What is the capital city of Poland?</b>&quot;</span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0"> will output a single response and return to the </span><span class="NormalTextRun SCXW99918251 BCX0">shell.</span></span><span class="EOP SCXW99918251 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
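				<p>The one-off form of the command is convenient for scripting, because it prints a single answer and exits. Below is a minimal Python sketch of how a script might wrap it, assuming <b>ollama</b> is on your PATH and the model has already been pulled. The helper names <b>build_run_command</b> and <b>ask_once</b>, and the five-minute timeout, are our own illustrative choices and not part of Ollama.</p>

```python
import shutil
import subprocess


def build_run_command(model, prompt):
    """Build the argument list for a one-off 'ollama run MODEL PROMPT' call."""
    # Passing a list (not a shell string) avoids quoting problems in the prompt.
    return ["ollama", "run", model, prompt]


def ask_once(model, prompt, timeout_s=300):
    """Run a single prompt through the Ollama CLI and return the printed answer."""
    if shutil.which("ollama") is None:
        raise RuntimeError("ollama executable not found on PATH")
    result = subprocess.run(
        build_run_command(model, prompt),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output as text
        timeout=timeout_s,    # assumed limit; the first call also loads the model
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


# Usage (requires a local Ollama installation with the model pulled):
# print(ask_once("llama2:7b-chat", "What is the capital city of Poland?"))
```

				<p>For anything beyond occasional calls, the REST API is usually a better fit than spawning a new process per prompt.</p>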
				<div class="elementor-element elementor-element-846e51f elementor-widget elementor-widget-text-editor" data-id="846e51f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW252478220 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW252478220 BCX0">You can also start </span><span class="NormalTextRun SpellingErrorV2Themed SCXW252478220 BCX0">Ollama</span><span class="NormalTextRun SCXW252478220 BCX0"> as a background server with </span><span class="NormalTextRun SCXW252478220 BCX0"><b>ollama serve</b></span><span class="NormalTextRun SCXW252478220 BCX0">. This enables the REST API on localhost:11434, which developers can use to integrate the model into apps via HTTP </span><span class="NormalTextRun SCXW252478220 BCX0">calls.</span><span class="NormalTextRun SCXW252478220 BCX0"> You can query the model by sending a POST request, e.g.:</span></span><span class="EOP SCXW252478220 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-7fa1a27 elementor-widget elementor-widget-text-editor" data-id="7fa1a27" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>curl http://localhost:11434/api/generate -d '{
  "model": "llama2:7b-chat",
  "prompt": "What is the capital city of Poland?"
}'</pre>						</div>
				</div>
				<div class="elementor-element elementor-element-77dd7b1 elementor-widget elementor-widget-text-editor" data-id="77dd7b1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>By default, the API streams its reply as newline-separated JSON objects, one chunk at a time, as the model generates the response. The final object has "done": true and carries timing and token statistics:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-943d033 elementor-widget elementor-widget-text-editor" data-id="943d033" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>{
    "model": "llama2:7b-chat",
    "created_at": "2025-04-02T15:19:17.1569954Z",
    "response": "The",
    "done": false
}
{
    "model": "llama2:7b-chat",
    "created_at": "2025-04-02T15:19:17.268992Z",
    "response": " capital",
    "done": false
}
{
    "model": "llama2:7b-chat",
    "created_at": "2025-04-02T15:19:17.3796491Z",
    "response": " city",
    "done": false
}
...
{
    "model": "llama2:7b-chat",
    "created_at": "2025-04-02T15:19:21.3106413Z",
    "response": " Warszawa",
    "done": false
}
{
    "model": "llama2:7b-chat",
    "created_at": "2025-04-02T15:19:21.4619772Z",
    "response": ").",
    "done": false
}
{
    "model": "llama2:7b-chat",
    "created_at": "2025-04-02T15:19:21.6296267Z",
    "response": "",
    "done": true,
    "done_reason": "stop",
    "total_duration": 5337417000,
    "load_duration": 8625100,
    "prompt_eval_count": 28,
    "prompt_eval_duration": 854952300,
    "eval_count": 15,
    "eval_duration": 4472807400
}</pre>						</div>
				</div>
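<p>The timing fields in the response above are easier to read once converted. The sketch below assumes, as the Ollama API documentation states, that all <code>*_duration</code> values are reported in nanoseconds. The function name <code>summarize_timing</code> is our own illustration and not part of Ollama.</p>

```python
# Ollama reports every *_duration field in nanoseconds.
NANOSECONDS_PER_SECOND = 1_000_000_000


def summarize_timing(response: dict) -> dict:
    """Convert raw timing fields into seconds and tokens per second."""
    prompt_seconds = response["prompt_eval_duration"] / NANOSECONDS_PER_SECOND
    eval_seconds = response["eval_duration"] / NANOSECONDS_PER_SECOND
    return {
        "total_seconds": response["total_duration"] / NANOSECONDS_PER_SECOND,
        "load_seconds": response["load_duration"] / NANOSECONDS_PER_SECOND,
        # Speed of reading the prompt (input tokens).
        "prompt_tokens_per_second": response["prompt_eval_count"] / prompt_seconds,
        # Speed of generating the answer (output tokens).
        "generated_tokens_per_second": response["eval_count"] / eval_seconds,
    }


# Timing values copied from the sample response shown above.
sample_response = {
    "total_duration": 5337417000,
    "load_duration": 8625100,
    "prompt_eval_count": 28,
    "prompt_eval_duration": 854952300,
    "eval_count": 15,
    "eval_duration": 4472807400,
}

summary = summarize_timing(sample_response)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

<p>Tracking the generated tokens per second over time is a simple way to compare models or hardware configurations before committing to one.</p>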
				<div class="elementor-element elementor-element-10d3ffe elementor-widget elementor-widget-text-editor" data-id="10d3ffe" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW26657317 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW26657317 BCX0">If you set stream: </span></span><span class="TextRun SCXW26657317 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW26657317 BCX0">false</span></span><span class="TextRun SCXW26657317 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW26657317 BCX0">, the response is a single JSON object:</span></span><span class="EOP SCXW26657317 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-af79c25 elementor-widget elementor-widget-text-editor" data-id="af79c25" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">curl </span></span><a class="Hyperlink SCXW81302069 BCX0" href="http://localhost:11434/api/generate" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0" data-ccp-charstyle="Hyperlink">http://localhost:11434/api/generate</span></span></a><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0"> -d </span></span><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">'{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW81302069 BCX0"><span class="SCXW81302069 BCX0"> </span><br class="SCXW81302069 BCX0" /></span><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">  "model": "llama2:7b-chat",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW81302069 BCX0"><span class="SCXW81302069 BCX0"> </span><br class="SCXW81302069 BCX0" /></span><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">  "prompt": "What is the capital city of Poland?",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW81302069 BCX0"><span class="SCXW81302069 BCX0"> </span><br class="SCXW81302069 BCX0" /></span><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">  "stream": false</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW81302069 BCX0"><span class="SCXW81302069 BCX0"> </span><br class="SCXW81302069 BCX0" /></span><span class="TextRun SCXW81302069 BCX0" 
lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">}'</span></span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-3d76430 elementor-widget elementor-widget-text-editor" data-id="3d76430" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0">You can also set </span><span class="NormalTextRun SCXW62602235 BCX0">a number of</span><span class="NormalTextRun SCXW62602235 BCX0"> model parameters, such as </span></span><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0">temperature</span></span><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0">, by adding an </span></span><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0">options</span></span><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0"> field:</span></span><span class="EOP SCXW62602235 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-12fc8b0 elementor-widget elementor-widget-text-editor" data-id="12fc8b0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">curl </span></span><a class="Hyperlink SCXW121643900 BCX0" href="http://localhost:11434/api/generate" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0" data-ccp-charstyle="Hyperlink">http://localhost:11434/api/generate</span></span></a><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0"> -d </span></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">'{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  "model": "llama2:7b-chat",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  "prompt": "What is the capital city of Poland?",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  "options": {</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span> <span 
class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">   "temperature": 0.2</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  },</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  "stream": false</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">}'</span></span></pre>						</div>
				</div>
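<p>When the same request is sent from application code, building the body with a JSON library avoids hand-written quoting and comma mistakes. The following is a minimal sketch using only the Python standard library. The helper names <code>build_generate_payload</code> and <code>generate</code> are our own, and <code>generate</code> assumes an Ollama server is listening on the default local port used in the curl examples.</p>

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_payload(model, prompt, temperature=None, stream=False):
    """Return the UTF-8 encoded JSON body for a /api/generate request."""
    body = {"model": model, "prompt": prompt, "stream": stream}
    if temperature is not None:
        # Model parameters go into the nested "options" object.
        body["options"] = {"temperature": temperature}
    return json.dumps(body).encode("utf-8")


def generate(model, prompt, temperature=None):
    """Send a non-streaming request; needs a running local Ollama server."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_generate_payload(model, prompt, temperature),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Build (but do not send) the same request as the curl example.
payload = build_generate_payload(
    "llama2:7b-chat", "What is the capital city of Poland?", temperature=0.2
)
print(payload.decode("utf-8"))
```

<p>Calling <code>generate("llama2:7b-chat", "What is the capital city of Poland?", 0.2)</code> would then return the parsed response as a Python dictionary.</p>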
				<div class="elementor-element elementor-element-c478ddd elementor-widget elementor-widget-text-editor" data-id="c478ddd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW13767485 BCX0">4. <b>Customize Models:</b></span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><b> </b><span class="NormalTextRun SpellingErrorV2Themed SCXW13767485 BCX0">Ollama</span><span class="NormalTextRun SCXW13767485 BCX0"> supports a </span><span class="NormalTextRun SpellingErrorV2Themed SCXW13767485 BCX0">Dockerfile</span><span class="NormalTextRun SCXW13767485 BCX0">-like syntax called a </span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW13767485 BCX0"><b>Modelfile</b></span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW13767485 BCX0"><b> </b>to create </span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW13767485 BCX0"><b>custom LLM variants</b></span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW13767485 BCX0">. These let you:</span></span><span class="EOP SCXW13767485 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-dde64e1 elementor-widget elementor-widget-text-editor" data-id="dde64e1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="none">Start from an existing model (like </span><span data-contrast="none">llama3</span><span data-contrast="none">)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="none">Add custom system prompts</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="none">Inject user-defined data (e.g., instructions, 
context)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><span data-contrast="none">Set model parameters, like temperature</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-9672691 elementor-widget elementor-widget-text-editor" data-id="9672691" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="none">Here is a simple example of how you can create a custom assistant for processing insurance documents:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-b3b173f elementor-widget elementor-widget-text-editor" data-id="b3b173f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">FROM llama2:7b-chat</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">PARAMETER temperature 0.7</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">SYSTEM </span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"""</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">You are an assistant that extracts insurance-related information from a given input text.</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject 
DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">You must extract and return only the following fields:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- policy_number</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- insurance_period</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- insured (company or person name)</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- nip (tax identification number)</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- address (of the insured)</span></span><span class="LineBreakBlob BlobObject DragDrop 
SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Return the output as a **clean JSON object** -- not as a string, not inside quotes, and without any commentary. If a field is missing, use "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Not found</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">".</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Example output format:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW168916518 BCX0">policy_number</span></span><span class="TextRun 
SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW168916518 BCX0">insurance_period</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">insured</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun 
SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">nip</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">address</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span 
class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"""</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">TEMPLATE </span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"""</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">{{ .</span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW168916518 BCX0">System }</span><span class="NormalTextRun SCXW168916518 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> 
</span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Input:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">{{ .</span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW168916518 BCX0">Prompt }</span><span class="NormalTextRun SCXW168916518 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Response:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"""</span></span><span class="EOP SCXW168916518 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-70f4001 elementor-widget elementor-widget-text-editor" data-id="70f4001" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="NormalTextRun SCXW237655292 BCX0">To use the </span><span class="NormalTextRun SpellingErrorV2Themed SCXW237655292 BCX0">Modelfile</span><span class="NormalTextRun SCXW237655292 BCX0">, save it in a directory, e.g. insurance-</span><span class="NormalTextRun SCXW237655292 BCX0">a</span><span class="NormalTextRun SCXW237655292 BCX0">ssistant</span><span class="NormalTextRun SCXW237655292 BCX0">, and create the custom model:</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5d4fca1 elementor-widget elementor-widget-text-editor" data-id="5d4fca1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW150813743 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW150813743 BCX0">ollama</span><span class="NormalTextRun SCXW150813743 BCX0"> create insurance-assistant -f insurance-</span><span class="NormalTextRun SCXW150813743 BCX0">assistant</span><span class="NormalTextRun SCXW150813743 BCX0">/</span><span class="NormalTextRun SpellingErrorV2Themed SCXW150813743 BCX0">Modelfile</span></span><span class="EOP SCXW150813743 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-3cfba04 elementor-widget elementor-widget-text-editor" data-id="3cfba04" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="none">Then, you can use your model by providing the proper model name in a request:</span> </p>						</div>
				</div>
				<div class="elementor-element elementor-element-607dc80 elementor-widget elementor-widget-text-editor" data-id="607dc80" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span><span data-contrast="none">curl </span><a href="http://localhost:11434/api/generate"><span data-contrast="none">http://localhost:11434/api/generate</span></a><span data-contrast="none"> -d </span><span data-contrast="none">'{</span> <br /><span data-contrast="none">  "model": "insurance-assistant",</span> <br /><span data-contrast="none">  "prompt": "",</span> <br /><span data-contrast="none">  "stream": false</span> <br /><span data-contrast="none">}'</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
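<p>The same request can be sent from code. The sketch below is a minimal illustration that uses only the Python standard library and the endpoint and fields shown in the curl call above. The helper names <code>build_generate_request</code> and <code>generate</code> are hypothetical, and the code assumes Ollama is listening on its usual local port and returns the generated text in the <code>response</code> field of a non-streamed reply.</p>

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust if your Ollama instance differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt):
    """Build (but do not send) a non-streaming POST request for /api/generate."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model, prompt, timeout=120):
    """Send the request to a running Ollama server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.loads(reply.read().decode("utf-8"))
    # With "stream": false the whole answer arrives in a single JSON object.
    return body["response"]
```

<p>With the server running, a call such as <code>generate("insurance-assistant", "your prompt here")</code> should return the model's answer as a plain string.</p>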
				<div class="elementor-element elementor-element-bc5221c elementor-widget elementor-widget-text-editor" data-id="bc5221c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW210001513 BCX0">Ollama</span><span class="NormalTextRun SCXW210001513 BCX0"> is purely CLI-based, so </span><span class="NormalTextRun SCXW210001513 BCX0">there’s</span><span class="NormalTextRun SCXW210001513 BCX0"> no graphical interface. However, this makes it </span></span><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW210001513 BCX0">powerful for automation</span></span><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW210001513 BCX0"> – you can pipe input/output, log responses to files, or call the </span><span class="NormalTextRun SpellingErrorV2Themed SCXW210001513 BCX0">Ollama</span><span class="NormalTextRun SCXW210001513 BCX0"> API from code. In summary, with just a few commands, you have a privacy-protecting LLM running on your PC, ready to answer questions or </span><span class="NormalTextRun SCXW210001513 BCX0">assist</span><span class="NormalTextRun SCXW210001513 BCX0"> in coding, all </span></span><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW210001513 BCX0">without needing an internet connection once the model is downloaded</span></span><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW210001513 BCX0">.</span></span><span class="EOP SCXW210001513 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
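<p>As one illustration of the automation point above, model responses can be appended to a log file for later review. This is a sketch rather than a recommended logging setup: the function names, the JSON Lines layout and the field names are all assumptions chosen for the example, and the response text is expected to come from whatever code calls the Ollama API.</p>

```python
import json
from datetime import datetime, timezone


def append_log_entry(log_path, model, prompt, response_text):
    """Append one prompt/response pair to a JSON Lines file and return the entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        "response": response_text,
    }
    # One JSON object per line keeps the file easy to tail, grep and re-parse.
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


def read_log(log_path):
    """Load every logged entry back into a list of dictionaries."""
    with open(log_path, encoding="utf-8") as log_file:
        return [json.loads(line) for line in log_file if line.strip()]
```

<p>Because each line is an independent JSON object, the log can grow indefinitely and still be processed line by line.</p>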
				<div class="elementor-element elementor-element-ce97082 elementor-widget elementor-widget-heading" data-id="ce97082" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Getting Started with LM Studio (Desktop App) </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-f911779 elementor-widget elementor-widget-image" data-id="f911779" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7711" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm1/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1.png" data-orig-size="1920,1080" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM1" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1-1030x579.png" tabindex="0" role="button" width="1030" height="579" src="https://inero-software.com/wp-content/uploads/2025/04/LLM1-1030x579.png" class="attachment-large size-large wp-image-7711" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/LLM1-1030x579.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/LLM1-300x169.png 300w, https://inero-software.com/wp-content/uploads/2025/04/LLM1-768x432.png 768w, https://inero-software.com/wp-content/uploads/2025/04/LLM1-1536x864.png 1536w, https://inero-software.com/wp-content/uploads/2025/04/LLM1-533x300.png 533w, https://inero-software.com/wp-content/uploads/2025/04/LLM1.png 1920w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7711" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm1/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1.png" data-orig-size="1920,1080" data-comments-opened="0" 
data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM1" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1-1030x579.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-14c3b4c elementor-widget elementor-widget-text-editor" data-id="14c3b4c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><b><span data-contrast="none">LM Studio</span></b><span data-contrast="none"> is a user-friendly desktop application that lets you </span><b><span data-contrast="none">download and run local LLMs via a graphical interface</span></b><span data-contrast="none">. It’s cross-platform (Windows, macOS, Linux) and ideal for beginners who prefer not to use the command line. With LM Studio, you can chat with models in a nice UI, manage model downloads, and even run a local server to use the model in other apps.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p><p><span data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-623f7d7 elementor-widget elementor-widget-text-editor" data-id="623f7d7" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW122283343 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW122283343 BCX0"><b>1. Install and Launch LM Studio:</b></span></span><span class="TextRun SCXW122283343 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW122283343 BCX0"> Download the installer for your OS from the LM Studio website and install it. After installation, launch the </span></span><span class="TextRun SCXW122283343 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW122283343 BCX0"><b>LM Studio</b></span></span><span class="TextRun SCXW122283343 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW122283343 BCX0"><b> app</b>. The first time you open </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW122283343 BCX0">it,</span> <span class="NormalTextRun SCXW122283343 BCX0">you’ll</span><span class="NormalTextRun SCXW122283343 BCX0"> be prompted to download an AI model. You can choose from a list of popular open-source models. For example, you might start with a smaller model such as “Mistral 7B” or an instruction-tuned Llama 2 variant.</span></span><span class="EOP SCXW122283343 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-eef6a33 elementor-widget elementor-widget-text-editor" data-id="eef6a33" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0">2. Run Your First Chat:</span></span></strong><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0"> Once the model is downloaded, LM Studio will load it into memory. You can then start a new chat session in the app. The interface typically has a text box where you can enter your prompt or question, and the model’s response will appear in the chat window. Simply type a query (for example: </span></span><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0">“What’s the capital of France?”</span></span><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0"> or </span></span><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0">“Explain quantum physics simply.”</span></span><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0">) and hit Enter. The AI’s answer will be displayed as the “Assistant” reply in the chat. LM Studio conveniently shows the generation metrics:</span></span><span class="EOP SCXW160100961 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-e83ff6c elementor-widget elementor-widget-text-editor" data-id="e83ff6c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="10" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="none">number of input and output tokens,</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="10" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="none">tokens per second &#8211; you can see how fast the model is generating text,</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="10" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="none">context occupancy,</span><span 
data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="10" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><span data-contrast="none">system resources usage (RAM and processor usage).</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
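<p>To make the metrics above concrete, the arithmetic behind two of them can be sketched as follows. The function names are hypothetical, and the definition of context occupancy used here (prompt plus generated tokens as a share of the context window) is an assumption for illustration; LM Studio may compute its displayed figures differently.</p>

```python
def tokens_per_second(output_tokens, generation_seconds):
    """Generation speed: tokens produced divided by the time taken to produce them."""
    if not generation_seconds > 0:
        raise ValueError("generation_seconds must be positive")
    return output_tokens / generation_seconds


def context_occupancy(input_tokens, output_tokens, context_length):
    """Fraction of the context window used by the conversation so far."""
    if not context_length > 0:
        raise ValueError("context_length must be positive")
    return (input_tokens + output_tokens) / context_length
```

<p>For example, 240 generated tokens in 8 seconds is 30 tokens per second, and a conversation holding 1,024 tokens in a 4,096-token window occupies 25% of the context.</p>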
				<div class="elementor-element elementor-element-4c50be9 elementor-widget elementor-widget-text-editor" data-id="4c50be9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW47407587 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW47407587 BCX0">3. Explore the Features:</span></span></strong><span class="TextRun SCXW47407587 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW47407587 BCX0"> The LM Studio GUI provides </span><span class="NormalTextRun SCXW47407587 BCX0">additional</span><span class="NormalTextRun SCXW47407587 BCX0"> features accessible to both beginners and advanced users:</span></span><span class="EOP SCXW47407587 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-0bc4155 elementor-widget elementor-widget-text-editor" data-id="0bc4155" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><strong><span class="TextRun SCXW37213095 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW37213095 BCX0">Model Library:</span></span></strong><span class="TextRun SCXW37213095 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW37213095 BCX0"> A “Discover Models” or </span><span class="NormalTextRun SpellingErrorV2Themed SCXW37213095 BCX0">catalog</span><span class="NormalTextRun SCXW37213095 BCX0"> section where you can download new models or update existing ones. </span><span class="NormalTextRun SCXW37213095 BCX0">You’re</span><span class="NormalTextRun SCXW37213095 BCX0"> not limited to one model – you can have multiple models stored and switch between them. This means you have a wide selection: from small 3B parameter models for speed, up to 70B models if your system can handle them.</span></span><span class="EOP SCXW37213095 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-c3af6d0 elementor-widget elementor-widget-text-editor" data-id="c3af6d0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><strong><span class="TextRun SCXW224090495 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW224090495 BCX0">Chat Interface:</span></span></strong><span class="TextRun SCXW224090495 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW224090495 BCX0"> The main chat screen (as shown above) is where you interact with the model. Each new prompt you enter is answered by the model in a conversational format. You can have multi-turn dialogues, just like chatting with ChatGPT. </span><span class="NormalTextRun SCXW224090495 BCX0">There’s</span><span class="NormalTextRun SCXW224090495 BCX0"> no need to manage a prompt history manually – the app keeps the conversation context.</span></span><span class="EOP SCXW224090495 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-fcf03f1 elementor-widget elementor-widget-text-editor" data-id="fcf03f1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><strong><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">Advanced Settings:</span></span></strong><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> On the side panel, LM Studio offers configuration knobs for those who want more control. You can set a </span></span><strong><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">system prompt</span></span></strong><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> (a role or instruction that guides the AI’s </span><span class="NormalTextRun SpellingErrorV2Themed SCXW52102321 BCX0">behavior</span><span class="NormalTextRun SCXW52102321 BCX0"> globally) and adjust generation settings such as </span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">temperature</span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> (creativity vs. 
consistency), </span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">top-p</span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> or </span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">top-k</span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> sampling for controlling randomness, and the maximum number of tokens per response. These options let you fine-tune how the model responds without writing any code. For instance, you could set a system instruction like “You are a helpful coding assistant</span><span class="NormalTextRun SCXW52102321 BCX0">”.</span><span class="NormalTextRun SCXW52102321 BCX0"> This is a friendly way to customize </span><span class="NormalTextRun SpellingErrorV2Themed SCXW52102321 BCX0">behavior</span><span class="NormalTextRun SCXW52102321 BCX0">, though </span><span class="NormalTextRun SCXW52102321 BCX0">it’s</span><span class="NormalTextRun SCXW52102321 BCX0"> not as extensive as programmatic control in a CLI tool.</span></span><span class="EOP SCXW52102321 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-5378615 elementor-widget elementor-widget-image" data-id="5378615" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7710" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm2/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2.png" data-orig-size="1920,1080" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM2" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2-1030x579.png" tabindex="0" role="button" width="1030" height="579" src="https://inero-software.com/wp-content/uploads/2025/04/LLM2-1030x579.png" class="attachment-large size-large wp-image-7710" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/LLM2-1030x579.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/LLM2-300x169.png 300w, https://inero-software.com/wp-content/uploads/2025/04/LLM2-768x432.png 768w, https://inero-software.com/wp-content/uploads/2025/04/LLM2-1536x864.png 1536w, https://inero-software.com/wp-content/uploads/2025/04/LLM2-533x300.png 533w, https://inero-software.com/wp-content/uploads/2025/04/LLM2.png 1920w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7710" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm2/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2.png" data-orig-size="1920,1080" data-comments-opened="0" 
data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM2" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2-1030x579.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-8d32129 elementor-widget elementor-widget-text-editor" data-id="8d32129" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="NormalTextRun SCXW87331471 BCX0">Advanced settings – </span><span class="NormalTextRun SCXW87331471 BCX0">simple </span><span class="NormalTextRun SCXW87331471 BCX0">example of </span><span class="NormalTextRun SCXW87331471 BCX0">AI assistant</span><span class="NormalTextRun SCXW87331471 BCX0"> for processing insurance documents</span></h6>						</div>
				</div>
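<p>For readers curious about what the temperature, top-k and top-p settings do, the toy sketch below applies them to a made-up list of token scores. It is a simplified illustration of the general sampling techniques, not LM Studio's or llama.cpp's actual implementation, and all function names and numbers are invented for the example.</p>

```python
import math


def apply_temperature(logits, temperature):
    """Convert raw scores to probabilities; a lower temperature sharpens the distribution."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(score - peak) for score in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_k_filter(probs, k):
    """Keep only the k most likely tokens and renormalise their probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = order[:k]
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def top_p_filter(probs, p):
    """Keep the smallest set of most likely tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}
```

<p>Lowering the temperature concentrates probability on the top-scoring token, while top-k and top-p both discard the unlikely tail before a token is sampled, which is why lower values of any of these settings tend to give more consistent output.</p>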
				<div class="elementor-element elementor-element-88a1e0d elementor-widget elementor-widget-text-editor" data-id="88a1e0d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><span class="TextRun SCXW87021791 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW87021791 BCX0"><strong>Local API Server</strong>:</span></span><span class="TextRun SCXW87021791 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW87021791 BCX0"> For developers, LM Studio includes a “Local LLM Server” mode. Just switch to the Developer tab, choose the model, and toggle the Start button. This enables an API endpoint on localhost that mimics the OpenAI API, allowing other programs to send requests to your local </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW87021791 BCX0">model.</span><span class="NormalTextRun SCXW87021791 BCX0"> This is powerful if you want to integrate the local LLM into your own applications (for example, connecting a chatbot UI or using the model for AI features in an IDE) while still </span><span class="NormalTextRun SCXW87021791 BCX0">benefiting</span><span class="NormalTextRun SCXW87021791 BCX0"> from privacy and not relying on external services.</span></span><span class="EOP SCXW87021791 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-369df09 elementor-widget elementor-widget-image" data-id="369df09" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7709" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm3/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3.png" data-orig-size="1920,1080" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM3" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3-1030x579.png" tabindex="0" role="button" width="1030" height="579" src="https://inero-software.com/wp-content/uploads/2025/04/LLM3-1030x579.png" class="attachment-large size-large wp-image-7709" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/LLM3-1030x579.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/LLM3-300x169.png 300w, https://inero-software.com/wp-content/uploads/2025/04/LLM3-768x432.png 768w, https://inero-software.com/wp-content/uploads/2025/04/LLM3-1536x864.png 1536w, https://inero-software.com/wp-content/uploads/2025/04/LLM3-533x300.png 533w, https://inero-software.com/wp-content/uploads/2025/04/LLM3.png 1920w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7709" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm3/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3.png" data-orig-size="1920,1080" data-comments-opened="0" 
data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM3" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3-1030x579.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-5937737 elementor-widget elementor-widget-text-editor" data-id="5937737" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="TextRun SCXW123659257 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW123659257 BCX0">Developer tab</span><span class="NormalTextRun SCXW123659257 BCX0"> &#8211;</span><span class="NormalTextRun SCXW123659257 BCX0"> you can </span><span class="NormalTextRun SCXW123659257 BCX0">enable</span><span class="NormalTextRun SCXW123659257 BCX0"> a local LLM server</span><span class="NormalTextRun SCXW123659257 BCX0"> hosting your customized LLM</span><span class="NormalTextRun SCXW123659257 BCX0">.</span></span><span class="EOP SCXW123659257 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559685&quot;:720,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></h6>						</div>
				</div>
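<p>Once the local server is running, other programs can talk to it using OpenAI-style requests. The sketch below is one possible way to do that with the Python standard library. It assumes the server listens on port 1234 (check the address shown in the Developer tab), that it exposes an OpenAI-compatible <code>/v1/chat/completions</code> route, and that the model identifier passed in matches the one LM Studio displays; the helper names are hypothetical.</p>

```python
import json
import urllib.request

# Assumed local address; use the one displayed in LM Studio's Developer tab.
LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions"


def build_chat_request(model, system_prompt, user_message,
                       temperature=0.2, top_p=0.9, max_tokens=256):
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        LM_STUDIO_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(model, system_prompt, user_message, timeout=120):
    """Send the request to the local server and return the assistant's reply text."""
    request = build_chat_request(model, system_prompt, user_message)
    with urllib.request.urlopen(request, timeout=timeout) as reply:
        body = json.loads(reply.read().decode("utf-8"))
    # OpenAI-style responses put the text under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

<p>The system prompt and sampling values in the payload play the same role as the Advanced Settings in the GUI, so a configuration tested in the chat window can be carried over to code.</p>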
				<div class="elementor-element elementor-element-06d2cee elementor-widget elementor-widget-text-editor" data-id="06d2cee" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW66765914 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW66765914 BCX0">Using LM Studio is as simple as using </span><span class="NormalTextRun SpellingErrorV2Themed SCXW66765914 BCX0">ChatGPT</span><span class="NormalTextRun SCXW66765914 BCX0"> – type and get answers – but it runs entirely on your own hardware. The </span></span><span class="TextRun SCXW66765914 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW66765914 BCX0">user-friendly interface</span></span><span class="TextRun SCXW66765914 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW66765914 BCX0"> lowers the barrier to </span><span class="NormalTextRun SCXW66765914 BCX0">entry, since</span><span class="NormalTextRun SCXW66765914 BCX0"> you </span><span class="NormalTextRun SCXW66765914 BCX0">don’t</span><span class="NormalTextRun SCXW66765914 BCX0"> need to use the terminal or remember commands. You get immediate, interactive AI responses, with buttons and menus to manage everything.</span></span><span class="EOP SCXW66765914 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-b60b847 elementor-widget elementor-widget-heading" data-id="b60b847" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Ollama vs. LM Studio: Tool Comparison </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-2d19913 elementor-widget elementor-widget-text-editor" data-id="2d19913" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW228077632 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW228077632 BCX0">Both </span><span class="NormalTextRun SpellingErrorV2Themed SCXW228077632 BCX0">Ollama</span><span class="NormalTextRun SCXW228077632 BCX0"> and LM Studio let you run LLMs locally, but they cater to slightly different audiences and use-cases. </span><span class="NormalTextRun SCXW228077632 BCX0">Here’s</span><span class="NormalTextRun SCXW228077632 BCX0"> a comparison of key aspects to help you understand their differences:</span></span><span class="EOP SCXW228077632 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-872db63 elementor-widget elementor-widget-text-editor" data-id="872db63" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0"><b>Interface &amp; Ease of Use</b>:</span></span> <span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">LM Studio</span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0"> provides a polished </span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">graphical user interface</span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">, which makes it extremely approachable for beginners. </span><span class="NormalTextRun SCXW183865909 BCX0">It’s</span><span class="NormalTextRun SCXW183865909 BCX0"> point-and-click with an integrated chat window, so no technical knowledge is </span><span class="NormalTextRun SCXW183865909 BCX0">required</span><span class="NormalTextRun SCXW183865909 BCX0"> to get </span><span class="NormalTextRun SCXW183865909 BCX0">started.</span> </span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW183865909 BCX0">Ollama</span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">, on the other hand, is a </span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">command-line interface (CLI)</span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span 
class="NormalTextRun SCXW183865909 BCX0"> tool (with an optional REST API). It offers a lot of power and </span><span class="NormalTextRun SCXW183865909 BCX0">flexibility but</span><span class="NormalTextRun SCXW183865909 BCX0"> does require comfort with the terminal to use </span><span class="NormalTextRun SCXW183865909 BCX0">effectively.</span><span class="NormalTextRun SCXW183865909 BCX0"> Beginners might find </span><span class="NormalTextRun SpellingErrorV2Themed SCXW183865909 BCX0">Ollama’s</span><span class="NormalTextRun SCXW183865909 BCX0"> learning curve steeper, </span><span class="NormalTextRun SCXW183865909 BCX0">whereas</span><span class="NormalTextRun SCXW183865909 BCX0"> LM Studio feels more plug-and-play.</span></span><span class="EOP SCXW183865909 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-4f4ccfd elementor-widget elementor-widget-text-editor" data-id="4f4ccfd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul>
<li style="list-style-type: none;">
<ul>
<li><span class="TextRun SCXW89912573 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW89912573 BCX0"><b>Supported Models:</b></span></span><span class="TextRun SCXW89912573 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW89912573 BCX0"> Both tools support a wide range of open-source LLMs. LM Studio can load any model in GGUF format (the standard for llama.cpp), which means it can run models such as Llama 2 (7B, 13B, 70B), Mistral, Vicuna, Alpaca, and </span><span class="NormalTextRun SpellingErrorV2Themed SCXW89912573 BCX0">CodeLlama</span><span class="NormalTextRun SCXW89912573 BCX0">, </span><span class="NormalTextRun SCXW89912573 BCX0">as long as</span><span class="NormalTextRun SCXW89912573 BCX0"> you have the hardware for them.</span></span><span class="EOP SCXW89912573 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}">&nbsp;</span></li>
</ul>
</li>
</ul>						</div>
				</div>
				<div class="elementor-element elementor-element-8a420f5 elementor-widget elementor-widget-text-editor" data-id="8a420f5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul>
<li style="list-style-type: none;">
<ul>
<li><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0"><b>Use Cases Suited</b>:</span></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0"> Because of the above differences, </span></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0"><b>LM Studio is excellent for users who want a personal ChatGPT-like assistant on their PC with minimal setup</b></span></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0">. </span><span class="NormalTextRun SCXW72859417 BCX0">It’s</span><span class="NormalTextRun SCXW72859417 BCX0"> great for interactive Q&amp;A, brainstorming, or casual use – you launch it when you need it, type queries, get answers. </span></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><b><span class="NormalTextRun SpellingErrorV2Themed SCXW72859417 BCX0">Ollama</span><span class="NormalTextRun SCXW72859417 BCX0"> is ideal for developers or those who want to incorporate LLMs into projects or workflows</span></b></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0"><b>.</b> If you plan to experiment with prompts in scripts, fine-tune model </span><span class="NormalTextRun SpellingErrorV2Themed SCXW72859417 BCX0">behaviors</span><span class="NormalTextRun SCXW72859417 BCX0">, or build an app (like a chatbot, a coding assistant integration, etc.) 
that calls a local model, </span><span class="NormalTextRun SpellingErrorV2Themed SCXW72859417 BCX0">Ollama’s</span><span class="NormalTextRun SCXW72859417 BCX0"> CLI and API give you that flexibility.</span></span><span class="EOP SCXW72859417 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}">&nbsp;</span></li>
</ul>
</li>
</ul>						</div>
				</div>
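				<p>As a rough illustration of the integration path described above, the sketch below calls a locally running Ollama server from Python using only the standard library. It assumes Ollama's default local address (http://localhost:11434), its <code>/api/generate</code> endpoint, and a model that has already been pulled. The helper names <code>build_generate_payload</code> and <code>ask_local_model</code> are hypothetical and exist only for this example.</p>

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_payload(model: str, prompt: str) -> bytes:
    """Encode a non-streaming generate request as JSON bytes."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def ask_local_model(model: str, prompt: str, url: str = OLLAMA_URL) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = urllib.request.Request(
        url,
        data=build_generate_payload(model, prompt),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=120) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]
```

				<p>With the server running and a model such as <code>llama2</code> available, <code>ask_local_model("llama2", "Explain GGUF in one sentence.")</code> should return the model's reply as a plain string that a script or app can use directly.</p>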
				<div class="elementor-element elementor-element-b4240ce elementor-widget elementor-widget-heading" data-id="b4240ce" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Conclusion and Recommendations </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-a417a13 elementor-widget elementor-widget-text-editor" data-id="a417a13" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW229049291 BCX0">Deploying LLMs locally has </span><span class="NormalTextRun SCXW229049291 BCX0">opened up</span><span class="NormalTextRun SCXW229049291 BCX0"> a world of possibilities for developers and enthusiasts. </span><span class="NormalTextRun SCXW229049291 BCX0">We’ve</span><span class="NormalTextRun SCXW229049291 BCX0"> discussed </span></span><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW229049291 BCX0"><b>Ollama</b></span></span><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW229049291 BCX0"> and </span></span><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW229049291 BCX0"><b>LM Studio</b></span></span><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW229049291 BCX0"><b> </b>– two excellent tools that make local AI accessible. To recap some guidance on choosing between them:</span></span><span class="EOP SCXW229049291 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-93b9fb0 elementor-widget elementor-widget-text-editor" data-id="93b9fb0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><span class="TextRun SCXW193410475 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW193410475 BCX0"><b>Choose LM Studio</b></span></span><span class="TextRun SCXW193410475 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW193410475 BCX0"> if you want a </span></span><span class="TextRun SCXW193410475 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW193410475 BCX0"><b>plug-and-play AI chat experience</b></span></span><span class="TextRun SCXW193410475 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW193410475 BCX0"> with a friendly GUI. </span><span class="NormalTextRun SCXW193410475 BCX0">It’s</span><span class="NormalTextRun SCXW193410475 BCX0"> perfect for beginners or those who prefer not to tinker with command lines. You get quick setup, easy model downloads, and a nice chat interface for </span><span class="NormalTextRun SCXW193410475 BCX0">interactions.</span><span class="NormalTextRun SCXW193410475 BCX0"> This might be best for someone who just wants an “offline ChatGPT” for personal use, note-taking, or idea generation without fussing over configurations. </span><span class="NormalTextRun SCXW193410475 BCX0">It’s</span><span class="NormalTextRun SCXW193410475 BCX0"> also a convenient way to demo LLM capabilities to non-technical users (since it feels like a normal app).</span></span><span class="EOP SCXW193410475 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-dcecadd elementor-widget elementor-widget-text-editor" data-id="dcecadd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><span class="TextRun SCXW242147822 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><b><span class="NormalTextRun SCXW242147822 BCX0">Choose </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242147822 BCX0">Ollama</span></b></span><span class="TextRun SCXW242147822 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW242147822 BCX0"><b> </b>if you want </span></span><span class="TextRun SCXW242147822 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW242147822 BCX0">more<b> control, automation, or integration</b></span></span><span class="TextRun SCXW242147822 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW242147822 BCX0"><b>.</b> Developers and power users will appreciate its flexibility – you can script it, run it headless on a server, integrate the local LLM into your own apps via the API, and adjust model </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242147822 BCX0">behavior</span><span class="NormalTextRun SCXW242147822 BCX0"> with </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242147822 BCX0">Modelfiles</span><span class="NormalTextRun SCXW242147822 BCX0">. If </span><span class="NormalTextRun SCXW242147822 BCX0">you’re</span><span class="NormalTextRun SCXW242147822 BCX0"> comfortable with a terminal and want to customize how the AI works (beyond what a GUI allows), </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242147822 BCX0">Ollama</span><span class="NormalTextRun SCXW242147822 BCX0"> is a better fit. 
</span><span class="NormalTextRun SCXW242147822 BCX0">It’s</span><span class="NormalTextRun SCXW242147822 BCX0"> also lightweight, which helps if you intend to run background AI services continuously.</span></span><span class="EOP SCXW242147822 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-cd8dc50 elementor-widget elementor-widget-text-editor" data-id="cd8dc50" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW16460031 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW16460031 BCX0">Finally, remember that the </span></span><span class="TextRun SCXW16460031 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW16460031 BCX0">LLM itself</span></span><span class="TextRun SCXW16460031 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW16460031 BCX0"> (the model you choose) is as important as the tool. Spend time finding a model that suits your task – whether </span><span class="NormalTextRun SCXW16460031 BCX0">it’s</span><span class="NormalTextRun SCXW16460031 BCX0"> a concise summarizer or a creative storyteller – and fits your hardware. Both </span><span class="NormalTextRun SpellingErrorV2Themed SCXW16460031 BCX0">Ollama</span><span class="NormalTextRun SCXW16460031 BCX0"> and LM Studio make it easy to swap models, so </span><span class="NormalTextRun SCXW16460031 BCX0">you’re</span><span class="NormalTextRun SCXW16460031 BCX0"> not locked in. The ecosystem of open-source models is growing rapidly, which means running a powerful AI on your own device is only getting easier and more common.</span></span><span class="EOP SCXW16460031 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5a3b149 elementor-widget elementor-widget-text-editor" data-id="5a3b149" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW154420196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW154420196 BCX0">In summary</span></span><span class="TextRun SCXW154420196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW154420196 BCX0">, deploying LLMs locally with these tools gives you the best of both worlds: AI capabilities </span><span class="NormalTextRun SCXW154420196 BCX0">similar to</span><span class="NormalTextRun SCXW154420196 BCX0"> cloud services, but with </span></span><span class="TextRun SCXW154420196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW154420196 BCX0"><b>privacy, control, and no per-request fees</b></span></span><span class="TextRun SCXW154420196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW154420196 BCX0"><b>.</b> Whether you go with a command-line power tool like </span><span class="NormalTextRun SpellingErrorV2Themed SCXW154420196 BCX0">Ollama</span><span class="NormalTextRun SCXW154420196 BCX0"> or a user-friendly app like LM Studio, </span><span class="NormalTextRun SCXW154420196 BCX0">you’ll</span><span class="NormalTextRun SCXW154420196 BCX0"> be working at the </span><span class="NormalTextRun SCXW154420196 BCX0">cutting edge</span><span class="NormalTextRun SCXW154420196 BCX0"> of local AI development. Happy </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW154420196 BCX0">experimenting, and</span><span class="NormalTextRun SCXW154420196 BCX0"> enjoy your new personal AI running right on your machine!</span></span><span class="EOP SCXW154420196 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
					</div>
				</div>
				</div>
		<p>The article <a href="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/">Deploying LLMs Locally: A Guide to Ollama and LM Studio</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7692</post-id>	</item>
		<item>
		<title>OpenAI vs. DeepSeek: A Technical Comparison Using Unified APIs</title>
		<link>https://inero-software.com/openai-vs-deepseek-a-technical-comparison-using-unified-apis/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Fri, 14 Mar 2025 13:35:14 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI Algorithms]]></category>
		<category><![CDATA[DeepSeek]]></category>
		<category><![CDATA[OpenAI]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7564</guid>

					<description><![CDATA[<p>In this post, we compare three popular LLMs (OpenAI’s GPT-4o mini and o3-mini, and the open-source DeepSeek R1) to evaluate how effectively they read and analyze statistical data from large PDFs.</p>
<p>The article <a href="https://inero-software.com/openai-vs-deepseek-a-technical-comparison-using-unified-apis/">OpenAI vs. DeepSeek: A Technical Comparison Using Unified APIs</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7564" class="elementor elementor-7564" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-4c59976 e-flex e-con-boxed e-con e-parent" data-id="4c59976" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-5cf6ead elementor-widget elementor-widget-html" data-id="5cf6ead" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-a037070 elementor-widget elementor-widget-text-editor" data-id="a037070" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4><span class="TextRun SCXW23850730 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW23850730 BCX0">Large language models (LLMs) are increasingly used to </span><span class="NormalTextRun SpellingErrorV2Themed SCXW23850730 BCX0">analyze</span><span class="NormalTextRun SCXW23850730 BCX0"> and extract insights from extensive documents, including lengthy statistical reports in PDF format. However, not all models perform equally when processing large files, especially those exceeding 50 pages. In this post, we conduct a comparative analysis of three popular LLMs (OpenAI</span><span class="NormalTextRun SCXW23850730 BCX0">’s GPT-4o mini</span><span class="NormalTextRun SCXW23850730 BCX0"> and</span><span class="NormalTextRun SCXW23850730 BCX0"> o3-mini, and the open-source </span><span class="NormalTextRun SpellingErrorV2Themed SCXW23850730 BCX0">DeepSeek</span><span class="NormalTextRun SCXW23850730 BCX0"> R1) to evaluate their effectiveness in reading and </span><span class="NormalTextRun SpellingErrorV2Themed SCXW23850730 BCX0">analyzing</span><span class="NormalTextRun SCXW23850730 BCX0"> statistical data from large PDFs. Our assessment focuses on three key factors: accuracy, response time, and estimated cost for each model.</span></span><span class="EOP SCXW23850730 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}">&nbsp;</span></h4>						</div>
				</div>
				<div class="elementor-element elementor-element-237cac6 elementor-widget elementor-widget-text-editor" data-id="237cac6" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW241218521 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW241218521 BCX0">To ensure a fair comparison, we utilized </span><span class="NormalTextRun SpellingErrorV2Themed SCXW241218521 BCX0">LiteLLM</span><span class="NormalTextRun SCXW241218521 BCX0">, a unified API that simplifies multi-model </span><span class="NormalTextRun SCXW241218521 BCX0">LLM </span><span class="NormalTextRun SCXW241218521 BCX0">benchmarking. By standardizing interactions across different LLM providers, </span><span class="NormalTextRun SpellingErrorV2Themed SCXW241218521 BCX0">LiteLLM</span><span class="NormalTextRun SCXW241218521 BCX0"> allowed us to focus on </span><span class="NormalTextRun SCXW241218521 BCX0">evaluating LLM performance</span><span class="NormalTextRun SCXW241218521 BCX0"> metrics rather than implementation differences.</span></span><span class="EOP SCXW241218521 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-53a2380 elementor-widget elementor-widget-heading" data-id="53a2380" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">A Unified API Approach </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-f569e31 elementor-widget elementor-widget-text-editor" data-id="f569e31" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW97117196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW97117196 BCX0">Comparing open-source and proprietary LLMs from different providers can be challenging due to their varying APIs. To standardize our testing, we utilized </span><span class="NormalTextRun SpellingErrorV2Themed SCXW97117196 BCX0">LiteLLM</span><span class="NormalTextRun SCXW97117196 BCX0">, a library that provides a consistent interface for interacting with multiple LLMs. This allowed for easier switching between models and </span><span class="NormalTextRun SCXW97117196 BCX0">facilitated</span><span class="NormalTextRun SCXW97117196 BCX0"> a more objective </span></span><span class="TextRun SCXW97117196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW97117196 BCX0">AI model comparison</span></span><span class="TextRun SCXW97117196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW97117196 BCX0">. Here is how easy it is to switch models using </span><span class="NormalTextRun SpellingErrorV2Themed SCXW97117196 BCX0">LiteLLM’s</span><span class="NormalTextRun SCXW97117196 BCX0"> unified API:</span></span><span class="EOP SCXW97117196 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
					</div>
				</div>
		<div class="elementor-element elementor-element-277031f e-flex e-con-boxed e-con e-parent" data-id="277031f" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-66b9211 elementor-widget elementor-widget-text-editor" data-id="66b9211" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>import litellm

# Call an OpenAI model; LiteLLM reads the key from the OPENAI_API_KEY environment variable.
response = litellm.completion(model="o3-mini", messages=[{"role": "user", "content": "Hello"}])

# Call DeepSeek R1 by changing only the model string (key read from DEEPSEEK_API_KEY).
response = litellm.completion(model="deepseek/deepseek-reasoner", messages=[{"role": "user", "content": "Hello"}])</pre>						</div>
				</div>
				<div class="elementor-element elementor-element-ad9012a elementor-widget elementor-widget-text-editor" data-id="ad9012a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW231103637 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW231103637 BCX0">This simplified approach helped us compare models without worrying about implementation complexities.</span></span><span class="EOP SCXW231103637 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
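				<p>Because every model is reached through the same call, response time can be measured with one small loop. The sketch below is an illustration only and not the harness used for the results in this post. The function name <code>time_models</code> is hypothetical, and <code>complete</code> is assumed to be any callable that accepts <code>model</code> and <code>messages</code> keyword arguments in the way <code>litellm.completion</code> does.</p>

```python
import time
from typing import Any, Callable, Dict, List

Message = Dict[str, str]


def time_models(
    complete: Callable[..., Any],
    models: List[str],
    messages: List[Message],
) -> Dict[str, float]:
    """Return wall-clock seconds taken by one completion call per model."""
    timings: Dict[str, float] = {}
    for model in models:
        start = time.perf_counter()
        complete(model=model, messages=messages)  # the response is discarded here
        timings[model] = time.perf_counter() - start
    return timings
```

				<p>A call such as <code>time_models(litellm.completion, ["o3-mini", "deepseek/deepseek-reasoner"], [{"role": "user", "content": "Hello"}])</code> would return one latency figure per model. A single call is noisy, so averaging several runs is likely to give steadier numbers.</p>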
				<div class="elementor-element elementor-element-bdf2fca elementor-widget elementor-widget-heading" data-id="bdf2fca" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">DeepSeek vs. OpenAI – model overview </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-2a9d55a elementor-widget elementor-widget-text-editor" data-id="2a9d55a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW72884465 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW72884465 BCX0">Before diving into the</span><span class="NormalTextRun SCXW72884465 BCX0"> AI model</span><span class="NormalTextRun SCXW72884465 BCX0"> benchmarking results, </span><span class="NormalTextRun SCXW72884465 BCX0">let&#8217;s</span><span class="NormalTextRun SCXW72884465 BCX0"> define key concepts and introduce the core specifications of the tested models.</span></span><span class="EOP SCXW72884465 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-3070cc7 elementor-widget elementor-widget-text-editor" data-id="3070cc7" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW16640192 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW16640192 BCX0">One of the most important parameters to consider</span><span class="NormalTextRun SCXW16640192 BCX0"> in LLM benchmarking</span><span class="NormalTextRun SCXW16640192 BCX0"> is the </span></span><span class="TextRun SCXW16640192 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW16640192 BCX0">context window</span></span><span class="TextRun SCXW16640192 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW16640192 BCX0">—the maximum input size a model can process at once. This is measured in tokens, which represent chunks of text rather than individual words. A larger context window allows the model to handle more extensive documents in a single request, which is particularly important when working with long statistical reports.</span></span><span class="EOP SCXW16640192 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
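				<p>To make the context window constraint concrete, here is a minimal sketch that checks whether a document is likely to fit. It assumes roughly four characters per token, a common rule of thumb for English text. Real counts depend on each model's tokenizer, so treat the result as an estimate. The names <code>estimate_tokens</code> and <code>fits_context</code> are hypothetical.</p>

```python
import math

# Assumption: roughly 4 characters per token, a rule of thumb for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate the token count from the character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_context(text: str, context_window: int, reserved_for_output: int = 0) -> bool:
    """Check whether the text, plus room reserved for the reply, fits the window."""
    return estimate_tokens(text) + reserved_for_output <= context_window
```

				<p>Reserving part of the window for the reply matters with long reports: a document that only just fits leaves the model no room to answer.</p>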
				<div class="elementor-element elementor-element-d7ab0b3 elementor-widget elementor-widget-text-editor" data-id="d7ab0b3" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW22985181 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW22985181 BCX0">The pricing for LLMs is typically based on token usage, which can vary depending on the type of tokens being processed. There are </span><span class="NormalTextRun SCXW22985181 BCX0">generally three</span><span class="NormalTextRun SCXW22985181 BCX0"> types of tokens involved in LLM pricing:</span></span><span class="EOP SCXW22985181 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-65a6290 elementor-widget elementor-widget-text-editor" data-id="65a6290" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ol><li data-leveltext="%1." data-font="Aptos" data-listid="12" data-list-defn-props="{&quot;335551671&quot;:1,&quot;335552541&quot;:0,&quot;335559683&quot;:0,&quot;335559684&quot;:-1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0,46],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">Input Tokens</span></b><span data-contrast="auto">: These are the tokens in the user’s input, such as the text or prompt sent to the model for processing. They are billed by the number of tokens sent in each request.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li><li><b><span data-contrast="auto">Cached Input Tokens</span></b><span data-contrast="auto">: Some models offer a caching mechanism, where previously used inputs are stored and reused in subsequent requests, reducing the need for reprocessing. Cached tokens are often charged at a lower rate than fresh input tokens, as the model does not need to process them again from scratch.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li><li><b><span data-contrast="auto">Output Tokens</span></b><span data-contrast="auto">: These are the tokens in the response generated by the model. 
They are billed by the amount of text the model generates in response to the user&#8217;s input.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ol>						</div>
				</div>
				<div class="elementor-element elementor-element-d9daaed elementor-widget elementor-widget-text-editor" data-id="d9daaed" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW245291604 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW245291604 BCX0">The models selected for this comparison are among the latest releases from the past </span><span class="NormalTextRun SCXW245291604 BCX0">several</span><span class="NormalTextRun SCXW245291604 BCX0"> months. While they differ in pricing and capabilities, we aim to assess whether these differences translate into measurable performance variations. Below is a breakdown of the key characteristics of </span></span><span class="TextRun SCXW245291604 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW245291604 BCX0">DeepSeek-R1</span></span><span class="TextRun SCXW245291604 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW245291604 BCX0">, </span></span><span class="TextRun SCXW245291604 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW245291604 BCX0">OpenAI 4o-mini</span></span><span class="TextRun SCXW245291604 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW245291604 BCX0">, and </span></span><span class="TextRun SCXW245291604 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW245291604 BCX0">OpenAI o3-mini</span></span><span class="TextRun SCXW245291604 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW245291604 BCX0">:</span></span><span class="EOP SCXW245291604 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-c83f80c elementor-widget elementor-widget-html" data-id="c83f80c" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<style>
    table {
        width: 100%;
        border-collapse: collapse;
        font-family: 'Roboto', sans-serif;
        font-weight: 300;
        font-size: 14px;
        color: #1C244B;
    }
    th, td {
        border: 1px solid #ddd;
        padding: 8px;
        text-align: left;
    }
    th {
        background-color: #f4f4f4;
        color: #1C244B;
    }
    tr:nth-child(even) {
        background-color: #f9f9f9;
    }
</style>

<table>
    <thead>
        <tr>
            <th></th>
            <th>DeepSeek-R1</th>
            <th>OpenAI 4o-mini</th>
            <th>OpenAI o3-mini</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td><strong>Context Window</strong></td>
            <td>128,000 tokens</td>
            <td>128,000 tokens (with a maximum output of 16,384 tokens)</td>
            <td>200,000 tokens (with a maximum output of 100,000 tokens)</td>
        </tr>
        <tr>
            <td><strong>Release Date</strong></td>
            <td>January 2025</td>
            <td>July 2024</td>
            <td>January 2025</td>
        </tr>
        <tr>
            <td><strong>Pricing (per 1 million tokens)</strong></td>
            <td>Input: $0.55<br>Cached input: $0.14<br>Output: $2.19</td>
            <td>Input: $0.15<br>Cached input: $0.075<br>Output: $0.60</td>
            <td>Input: $1.10<br>Cached input: $0.55<br>Output: $4.40</td>
        </tr>
        <tr>
            <td><strong>Input Formats</strong></td>
            <td>Text</td>
            <td>Text, Images (including PNG, JPEG, GIF, WEBP)</td>
            <td>Text</td>
        </tr>
        <tr>
            <td><strong>Output Formats</strong></td>
            <td>Text</td>
            <td>Text</td>
            <td>Text</td>
        </tr>
    </tbody>
</table>

<!-- Link to Google Fonts -->
<link href="https://fonts.googleapis.com/css2?family=Roboto:wght@300&display=swap" rel="stylesheet">
		</div>
				</div>
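<p>The three token types and the per-million prices in the table above combine into a per-request cost in a straightforward way. The following is a minimal sketch, not the billing logic of any provider: the names <code>TokenPrices</code> and <code>request_cost</code> are hypothetical, the prices are copied from the table, and the example request sizes are invented purely for illustration.</p>

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class TokenPrices:
    """USD prices per 1 million tokens (hypothetical helper type)."""
    input: float
    cached_input: float
    output: float


# Prices copied from the comparison table above.
PRICES = {
    "DeepSeek-R1": TokenPrices(input=0.55, cached_input=0.14, output=2.19),
    "OpenAI 4o-mini": TokenPrices(input=0.15, cached_input=0.075, output=0.60),
    "OpenAI o3-mini": TokenPrices(input=1.10, cached_input=0.55, output=4.40),
}


def request_cost(prices, input_tokens, output_tokens, cached_tokens=0):
    """Return the cost in USD of one request.

    cached_tokens is the part of input_tokens served from the cache;
    it is billed at the cached rate, the rest at the fresh input rate.
    """
    if cached_tokens < 0 or cached_tokens > input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * prices.input
        + cached_tokens * prices.cached_input
        + output_tokens * prices.output
    )
    return total / TOKENS_PER_MILLION


if __name__ == "__main__":
    # Invented example: 60,000 input tokens (50,000 of them cached)
    # and 1,000 output tokens.
    for name, prices in PRICES.items():
        cost = request_cost(prices, 60_000, 1_000, cached_tokens=50_000)
        print(f"{name}: ${cost:.4f}")
```

<p>Keeping cached tokens as a separate argument makes it easy to see how much of a request's cost depends on the cache hit rate, which matters most when the same large document is sent repeatedly.</p>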
				<div class="elementor-element elementor-element-fb0549b elementor-widget elementor-widget-heading" data-id="fb0549b" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">PDF file used for testing </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-898fdb5 elementor-widget elementor-widget-text-editor" data-id="898fdb5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW165493897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW165493897 BCX0">The document used for testing consists of several chapters of a report on the Polish and worldwide maritime economy in 2017-2020. The report is 50 pages long and includes various statistics and analyses of cargo traffic, shipping, shipbuilding, and other maritime industries. Most of the data is presented in tables, with additional explanations and summaries in the surrounding text. Example pages of the document used for testing:</span></span><span class="EOP SCXW165493897 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-cf84d20 elementor-widget elementor-widget-image" data-id="cf84d20" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7573" data-permalink="https://inero-software.com/openai-vs-deepseek-a-technical-comparison-using-unified-apis/grafika-14032025/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025.png" data-orig-size="2000,1414" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="grafika 14032025" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025-300x212.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025-1030x728.png" tabindex="0" role="button" width="1030" height="728" src="https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025-1030x728.png" class="attachment-large size-large wp-image-7573" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025-1030x728.png 1030w, https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025-300x212.png 300w, https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025-768x543.png 768w, https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025-1536x1086.png 1536w, https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025-424x300.png 424w, https://inero-software.com/wp-content/uploads/2025/03/grafika-14032025.png 2000w" sizes="(max-width: 1030px) 100vw, 1030px" />													</div>
				</div>
				<div class="elementor-element elementor-element-e46d19d elementor-widget elementor-widget-heading" data-id="e46d19d" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Testing Methodology </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-1ba1d08 elementor-widget elementor-widget-text-editor" data-id="1ba1d08" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW149358593 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW149358593 BCX0">We conducted a series of tests using maritime economy-themed prompts, with the PDF file supplied as context. Example prompts about information included in the PDF:</span></span><span class="EOP SCXW149358593 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-4228b97 elementor-widget elementor-widget-text-editor" data-id="4228b97" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="auto">Summarize the key economic findings from a maritime report.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="auto">What is the total cargo turnover of Polish sea ports in 2020?</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="6" 
data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="auto">What are the main cargo types handled by Polish sea ports?</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><span data-contrast="auto">Which countries are the main trading partners of Poland in seaborne trade?</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" 
data-aria-posinset="5" data-aria-level="1"><span data-contrast="auto">What is the average age of ships in the Polish maritime transport fleet?</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="6" data-aria-level="1"><span data-contrast="auto">What are the key economic indicators for the Polish shipbuilding industry?</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-26bc5cf elementor-widget elementor-widget-text-editor" data-id="26bc5cf" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW219856678 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW219856678 BCX0">As mentioned before, w</span><span class="NormalTextRun SCXW219856678 BCX0">e compared the following models:</span></span><span class="EOP SCXW219856678 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-890abbd elementor-widget elementor-widget-text-editor" data-id="890abbd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="7" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="auto">OpenAI&#8217;s 4o-mini</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="7" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="auto">OpenAI&#8217;s </span><span data-contrast="auto">o3-mini</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="7" 
data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="auto">DeepSeek&#8217;s </span><span data-contrast="auto">deepseek-reasoner (R1)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-153e657 elementor-widget elementor-widget-text-editor" data-id="153e657" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW236391729 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW236391729 BCX0">We measured the following metrics:</span></span><span class="EOP SCXW236391729 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-6ca549e elementor-widget elementor-widget-text-editor" data-id="6ca549e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="8" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">Inference Time</span></b><span data-contrast="auto"> – This refers to the time it takes for the model to generate a response after receiving a prompt. A lower inference time means faster responses, which is crucial for real-time applications and large-scale document processing.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="8" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Token Usage</span></b><span data-contrast="auto"> – LLMs process and generate text in units called </span><i><span data-contrast="auto">tokens</span></i><span data-contrast="auto">. A token can be a word, part of a word, or even a punctuation mark. The total token usage includes both input tokens (the user’s query or document) and output tokens (the model’s generated response). 
The more tokens used, the higher the cost of the request.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="8" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><b><span data-contrast="auto">Response Cost</span></b><span data-contrast="auto"> – This is calculated as </span><b><span data-contrast="auto">token usage × model pricing</span></b><span data-contrast="auto"> (per 1,000 or 1,000,000 tokens, depending on the provider). Since different models have different pricing structures, comparing response costs helps determine which model is more cost-effective for large-scale use cases.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
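<p>The three metrics above can be collected with a thin wrapper around each model call. The sketch below is an assumption-laden illustration rather than the harness used in these tests: <code>call_model</code> is a hypothetical callable standing in for a real API client, and the response shape (a dictionary with a <code>usage</code> entry holding <code>prompt_tokens</code> and <code>completion_tokens</code>) is assumed for the example. A stub model is used so the code runs without network access.</p>

```python
import time
from typing import Callable, Dict


def measure_request(
    call_model: Callable[[str], Dict],
    prompt: str,
    input_price_per_million: float,
    output_price_per_million: float,
) -> Dict[str, float]:
    """Time one model call and derive token usage and cost from its response.

    call_model is a hypothetical stand-in for a real client. It is assumed
    to return a dict containing a "usage" dict with "prompt_tokens" and
    "completion_tokens".
    """
    start = time.perf_counter()
    response = call_model(prompt)
    inference_time = time.perf_counter() - start

    usage = response["usage"]
    input_tokens = usage["prompt_tokens"]
    output_tokens = usage["completion_tokens"]

    # Response cost = token usage x model pricing (prices per 1M tokens).
    cost = (
        input_tokens * input_price_per_million
        + output_tokens * output_price_per_million
    ) / 1_000_000

    return {
        "inference_time_s": inference_time,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "cost_usd": cost,
    }


def fake_model(prompt: str) -> Dict:
    """Stub model returning a fixed answer and invented token counts."""
    return {
        "text": "stub answer",
        "usage": {"prompt_tokens": 1200, "completion_tokens": 300},
    }


if __name__ == "__main__":
    metrics = measure_request(fake_model, "What is the total cargo turnover?", 0.15, 0.60)
    print(metrics)
```

<p>Passing the model call in as a function keeps the measurement code identical for every provider, so only the client behind <code>call_model</code> changes between runs.</p>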
				<div class="elementor-element elementor-element-e807a52 elementor-widget elementor-widget-heading" data-id="e807a52" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Test Results </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-b36a1c6 elementor-widget elementor-widget-text-editor" data-id="b36a1c6" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW217432411 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW217432411 BCX0">Here are the summarized results from our tests</span><span class="NormalTextRun SCXW217432411 BCX0"> (each test was repeated several times)</span><span class="NormalTextRun SCXW217432411 BCX0">:</span></span><span class="EOP SCXW217432411 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f26f255 elementor-widget elementor-widget-html" data-id="f26f255" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">

<table>
    <thead>
        <tr>
            <th>Model</th>
            <th>Average Inference Time (s)</th>
            <th>Average Response Cost ($)</th>
            <th>Average Input Tokens</th>
            <th>Average Output Tokens</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td><strong>DeepSeek R1</strong></td>
            <td>57.2</td>
            <td>0.0039</td>
            <td>63961.7</td>
            <td>751.6</td>
        </tr>
        <tr>
            <td><strong>o3-mini</strong></td>
            <td>13.8</td>
            <td>0.0755</td>
            <td>63251.5</td>
            <td>1162.5</td>
        </tr>
        <tr>
            <td><strong>4o-mini</strong></td>
            <td>9.5</td>
            <td>0.0511</td>
            <td>62538.0</td>
            <td>1046.5</td>
        </tr>
    </tbody>
</table>

		</div>
				</div>
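<p>To put the averages in the table above side by side, it can help to express each model relative to a baseline and to scale the per-request cost to a larger workload. This is a small sketch using only the reported averages; the function names <code>relative_to</code> and <code>cost_for_requests</code> are hypothetical, and scaling linearly by request count assumes every request resembles the tested ones.</p>

```python
# Average results copied from the table above.
RESULTS = {
    "DeepSeek R1": {"time_s": 57.2, "cost_usd": 0.0039},
    "o3-mini": {"time_s": 13.8, "cost_usd": 0.0755},
    "4o-mini": {"time_s": 9.5, "cost_usd": 0.0511},
}


def relative_to(baseline, metric):
    """Return each model's value for metric divided by the baseline's value."""
    base_value = RESULTS[baseline][metric]
    return {model: row[metric] / base_value for model, row in RESULTS.items()}


def cost_for_requests(model, request_count):
    """Scale the average per-request cost linearly to request_count requests."""
    return RESULTS[model]["cost_usd"] * request_count


if __name__ == "__main__":
    for model, ratio in relative_to("4o-mini", "time_s").items():
        print(f"{model}: {ratio:.2f}x the inference time of 4o-mini")
    for model, ratio in relative_to("DeepSeek R1", "cost_usd").items():
        print(f"{model}: {ratio:.2f}x the cost of DeepSeek R1")
    for model in RESULTS:
        print(f"{model}: ${cost_for_requests(model, 1000):.2f} per 1,000 requests")
```

<p>On the reported averages, DeepSeek R1 takes roughly six times as long as 4o-mini per request, while o3-mini costs roughly nineteen times as much as DeepSeek R1.</p>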
				<div class="elementor-element elementor-element-65630f9 elementor-widget elementor-widget-heading" data-id="65630f9" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Key Observations</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-703de17 elementor-widget elementor-widget-text-editor" data-id="703de17" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">Inference Time</span></b><span data-contrast="auto">: DeepSeek consistently demonstrated longer inference times compared to both OpenAI models. This could be a significant factor for applications that prioritize fast processing.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Response Cost</span></b><span data-contrast="auto">: DeepSeek showed a competitive advantage in terms of cost, particularly for output tokens. Despite the longer inference time, DeepSeek’s overall cost per request remains lower than OpenAI o3-mini and 4o-mini. The lower response cost of DeepSeek can be attributed to its caching mechanism, which reduces the need to reprocess input data. 
Most of the input content, particularly the PDF file&#8217;s contents, was cached, leading to significant savings in processing costs. This caching system allowed DeepSeek to handle repeated queries more efficiently, making it a cost-effective option for processing large documents.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><b><span data-contrast="auto">Output Variability</span></b><span data-contrast="auto">: The models varied in style and the level of detail in their responses. This is important depending on the context and user requirements (e.g., high-level summaries vs. 
detailed analysis).</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><b><span data-contrast="auto">LiteLLM API</span></b><span data-contrast="auto">: LiteLLM made it extremely easy to track cost, token usage, and response time directly from the API responses, enabling a straightforward comparison between models.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-f56bc1c elementor-widget elementor-widget-heading" data-id="f56bc1c" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Conclusion </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-abbecc5 elementor-widget elementor-widget-text-editor" data-id="abbecc5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">Our tests highlight the advantages of using unified APIs for </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">LLM benchmarking</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">. </span><span class="NormalTextRun SpellingErrorV2Themed SCXW28694121 BCX0">LiteLLM</span><span class="NormalTextRun SCXW28694121 BCX0"> significantly simplified the process, allowing us to focus on </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">LLM efficiency assessment</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0"> and </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">evaluating AI language models</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">. 
While </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SpellingErrorV2Themed SCXW28694121 BCX0">DeepSeek</span><span class="NormalTextRun SCXW28694121 BCX0"> R1</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0"> demonstrated </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">competitive cost-effectiveness</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">, particularly due to its caching mechanism, it was by far the </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">slowest model</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0"> in our tests, with an average inference time of </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">57.2 seconds</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">. 
In contrast, </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">OpenAI o3-mini</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0"> and </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">4o-mini</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0"> provided significantly </span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">faster response times</span></span><span class="TextRun SCXW28694121 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW28694121 BCX0">, making them more suitable for real-time applications.</span></span><span class="EOP TrackedChange SCXW28694121 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
					</div>
				</div>
				</div>
		<p>The article <a href="https://inero-software.com/openai-vs-deepseek-a-technical-comparison-using-unified-apis/">OpenAI vs. DeepSeek: A Technical Comparison Using Unified APIs</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7564</post-id>	</item>
		<item>
		<title>Assessing Retrieval-Augmented Generation (RAG) Large Language Models (LLMs) with DeepEval for Complex Tabular Data</title>
		<link>https://inero-software.com/assessing-retrieval-augmented-generation-rag-large-language-models-llms-with-deepeval-for-complex-tabular-data/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Tue, 04 Feb 2025 10:33:15 +0000</pubDate>
				<category><![CDATA[Company]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[DeepEval]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[RAG]]></category>
		<category><![CDATA[Retrieval-Augmented Generation]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=6902</guid>

					<description><![CDATA[<p>This post explores how DeepEval helps systematically assess the effectiveness of both retrieval and generation components, ensuring more reliable machine-generated insights. </p>
<p>The article <a href="https://inero-software.com/assessing-retrieval-augmented-generation-rag-large-language-models-llms-with-deepeval-for-complex-tabular-data/">Assessing Retrieval-Augmented Generation (RAG) Large Language Models (LLMs) with DeepEval for Complex Tabular Data</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="6902" class="elementor elementor-6902" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-a77f132 e-flex e-con-boxed e-con e-parent" data-id="a77f132" data-element_type="container">
					<div class="e-con-inner">
		<div class="elementor-element elementor-element-eedef0f e-con-full e-flex e-con e-child" data-id="eedef0f" data-element_type="container">
				</div>
		<div class="elementor-element elementor-element-8bb2c58 e-con-full e-flex e-con e-child" data-id="8bb2c58" data-element_type="container">
		<div class="elementor-element elementor-element-cac0d92 e-con-full e-flex e-con e-child" data-id="cac0d92" data-element_type="container">
				<div class="elementor-element elementor-element-f3a0ecb elementor-widget elementor-widget-html" data-id="f3a0ecb" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-33c698c elementor-widget elementor-widget-text-editor" data-id="33c698c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4><span class="TextRun SCXW184211874 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW184211874 BCX0">Retrieval-Augmented Generation (RAG) models are transforming the capabilities of intelligent assistants, enabling more </span><span class="NormalTextRun SCXW184211874 BCX0">accurate</span><span class="NormalTextRun SCXW184211874 BCX0"> and context-aware responses to user queries. Unlike traditional large language models (LLMs), RAG-based systems integrate two essential components: a retrieval mechanism that fetches relevant documents and a generative model that synthesizes responses based on real-time </span><span class="NormalTextRun SCXW184211874 BCX0">data. This post explores how </span><span class="NormalTextRun SCXW184211874 BCX0">DeepEval</span><span class="NormalTextRun SCXW184211874 BCX0"> helps systematically assess the effectiveness of both retrieval and generation components, ensuring more reliable machine-generated insights.</span></span><span class="EOP TrackedChange SCXW184211874 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h4>						</div>
				</div>
				<div class="elementor-element elementor-element-6e6ea96 elementor-widget elementor-widget-text-editor" data-id="6e6ea96" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">While RAG-enhanced virtual assistants significantly improve answer relevance, evaluating their performance remains a challenge. Since these models rely on both retrieval and text generation, a weak document-fetching step can lead to misleading or incorrect responses, even if the underlying LLM is highly advanced.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">We’ll</span> <span data-contrast="auto">demonstrate</span><span data-contrast="auto"> this process using our custom AI-driven assistant</span><span data-contrast="auto">, designed to answer complex queries about </span><span data-contrast="auto">maritime economy statistics</span><span data-contrast="auto">, </span><span data-contrast="auto">showcasing</span><span data-contrast="auto"> how </span><span data-contrast="auto">LLM-powered knowledge retrieval enhances data-driven decision-making.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-d0eedd1 elementor-widget elementor-widget-heading" data-id="d0eedd1" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">SeaStat - Our AI Assistant </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-caacdce elementor-widget elementor-widget-text-editor" data-id="caacdce" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0">A good example</span></span></span><span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0"> for this discussion is the </span></span></span><span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0">SeaStat</span></span></span> <span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0">AI Assistant</span></span></span><span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0">, which we developed as part of the Incone60 Green Project (https://www.incone60.eu/). The project aims to improve the competitiveness and sustainable development of small seaports in the South Baltic region.</span></span></span><span class="EOP SCXW210561514 BCX0" data-ccp-props="{&quot;335551550&quot;:6,&quot;335551620&quot;:6}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-42941ab elementor-widget elementor-widget-image" data-id="42941ab" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="6904" data-permalink="https://inero-software.com/assessing-retrieval-augmented-generation-rag-large-language-models-llms-with-deepeval-for-complex-tabular-data/seastat/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png" data-orig-size="517,587" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="SeaStat" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/02/SeaStat-264x300.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png" tabindex="0" role="button" width="517" height="587" src="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png" class="attachment-large size-large wp-image-6904" alt="SeaStat" srcset="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png 517w, https://inero-software.com/wp-content/uploads/2025/02/SeaStat-264x300.png 264w" sizes="(max-width: 517px) 100vw, 517px" />													</div>
				</div>
				<div class="elementor-element elementor-element-bf0da8d elementor-widget elementor-widget-text-editor" data-id="bf0da8d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentStart CommentHighlightPipeRestRefresh CommentHighlightRest SCXW10433028 BCX0">During the Incone60 Green Project, we developed an AI assistant that answers questions about maritime economy data, providing instant access to structured maritime economic insights. This assistant </span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">leverages</span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0"> a </span></span><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">Retrieval-Augmented Generation (RAG)</span></span><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0"> approach, ensuring that responses are grounded in a structured database covering key aspects such as </span></span><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">seaports, maritime transport</span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">,</span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0"> shipbuilding, passenger traffic, trade, and the fishing industry</span></span><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">.</span></span><span class="EOP CommentHighlightPipeRestRefresh SCXW10433028 BCX0" 
data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-85e0e3e elementor-widget elementor-widget-text-editor" data-id="85e0e3e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Our AI assistant operates within a RAG pipeline that integrates:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li data-leveltext="" data-font="Symbol" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">A structured maritime economy database</span></b><span data-contrast="auto">, which includes global and Polish maritime statistics from 2017 to 2020. The data is sourced from publications by Gdynia Maritime University, which aggregate statistics from various government institutes, universities, and port enterprises. The database consists of 50 tables covering key aspects of maritime transport, and we plan to extend it with additional years. 
</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Dynamic SQL generation</span></b><span data-contrast="auto"> to extract relevant information from the database.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><b><span data-contrast="auto">A generative LLM</span></b><span data-contrast="auto"> that formulates answers based on the retrieved data.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><p><span data-contrast="auto">Building such an assistant requires several key decisions and parameter optimizations, including:</span><span 
data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li data-leveltext="" data-font="Symbol" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="auto">Selecting the most suitable LLM model and tuning parameters (e.g., temperature).</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="auto">Designing an effective prompt structure.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="3" 
data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="auto">Ensuring the assistant consistently selects the most relevant tables from the dataset.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><p><span data-contrast="auto">This is where </span><b><span data-contrast="auto">automatic testing</span></b><span data-contrast="auto"> becomes crucial. It helps assess system performance, identify weaknesses, and ensure continuous improvement.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
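<p>The three pipeline components listed above can be sketched as a single retrieve-then-generate function. This is a minimal sketch, not the SeaStat implementation: the <code>port_cargo</code> table, its values, and the two stub functions standing in for LLM calls are hypothetical, and an in-memory SQLite database is assumed only to keep the example self-contained.</p>

```python
import sqlite3
from typing import Callable


def answer_question(
    question: str,
    conn: sqlite3.Connection,
    generate_sql: Callable[[str], str],
    generate_answer: Callable[[str, list[tuple]], str],
) -> dict:
    """Run one retrieve-then-generate pass and keep the intermediate context."""
    sql = generate_sql(question)               # step 1: the LLM writes a query
    rows = conn.execute(sql).fetchall()        # step 2: retrieval from the database
    answer = generate_answer(question, rows)   # step 3: the LLM phrases the answer
    return {"question": question, "sql": sql, "context": rows, "answer": answer}


# Hypothetical table and made-up values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE port_cargo (port TEXT, year INTEGER, cargo_kt REAL)")
conn.executemany(
    "INSERT INTO port_cargo VALUES (?, ?, ?)",
    [("Port A", 2019, 120.0), ("Port A", 2020, 95.5), ("Port B", 2020, 40.0)],
)


def fake_generate_sql(question: str) -> str:
    """Stub for the SQL-writing LLM call; returns a fixed query."""
    return "SELECT port, cargo_kt FROM port_cargo WHERE year = 2020 ORDER BY port"


def fake_generate_answer(question: str, rows: list[tuple]) -> str:
    """Stub for the answer-writing LLM call; formats the retrieved rows."""
    return "; ".join(f"{port}: {cargo} kt" for port, cargo in rows)


result = answer_question(
    "How much cargo did each port handle in 2020?",
    conn,
    fake_generate_sql,
    fake_generate_answer,
)
```

<p>Returning the generated SQL and the retrieved rows alongside the answer matters for testing: the evaluation needs the retrieval context, not only the final text. In a real deployment the generated SQL should also run with read-only access.</p>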
				</div>
				<div class="elementor-element elementor-element-648b32b elementor-widget elementor-widget-heading" data-id="648b32b" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">LLM-as-a-Judge: Automating RAG Model Evaluation  </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-09073bd elementor-widget elementor-widget-text-editor" data-id="09073bd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Evaluating systems that generate non-deterministic, open-ended text outputs can be challenging because there is often no single &#8220;correct&#8221; answer. While human evaluation is accurate, it can be costly and time-consuming.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><b><span data-contrast="auto">LLM-as-a-Judge</span></b><span data-contrast="auto"> is a method that approximates human evaluation by rating the system&#8217;s output based on custom criteria tailored to your specific application. One such testing framework is </span><b><span data-contrast="auto">DeepEval</span></b><span data-contrast="auto">, which provides a set of metrics designed for both retrieval and generation tasks and allows you to create your own rating criteria. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
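<p>The LLM-as-a-Judge idea can be sketched in a few lines. This is a simplified illustration and not how DeepEval works internally: the prompt wording, the <code>judge</code> function, the 0 to 10 scale, and the 0.7 threshold are all assumptions, and <code>call_llm</code> stands for any function that sends a prompt to a model and returns its text reply.</p>

```python
import re
from typing import Callable

# Hypothetical judge prompt; real frameworks use more elaborate templates.
JUDGE_PROMPT = """You are grading an assistant's answer.
Criteria: {criteria}
Question: {question}
Expected answer: {expected}
Actual answer: {actual}
Reply with a single line in the form SCORE: N, where N is a number from 0 to 10."""


def judge(
    question: str,
    expected: str,
    actual: str,
    criteria: str,
    call_llm: Callable[[str], str],
    threshold: float = 0.7,
) -> dict:
    """Ask a judge model for a score and normalise it to the 0..1 range."""
    prompt = JUDGE_PROMPT.format(
        criteria=criteria, question=question, expected=expected, actual=actual
    )
    reply = call_llm(prompt)
    match = re.search(r"SCORE:\s*(\d+(?:\.\d+)?)", reply)
    if match is None:
        raise ValueError(f"Judge reply contains no score: {reply!r}")
    score = min(float(match.group(1)), 10.0) / 10.0
    return {"score": score, "passed": score >= threshold}
```

<p>Passing the model call in as a function keeps the judge independent of any provider and lets the scoring logic be tested with a stub before real API calls are spent on it.</p>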
				</div>
				<div class="elementor-element elementor-element-b4e1ef9 elementor-widget elementor-widget-text-editor" data-id="b4e1ef9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Key evaluation metrics are:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">G-Eval</span></b><span data-contrast="auto">: A versatile metric that evaluates LLM output based on custom-defined criteria.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Answer Relevancy</span></b><span data-contrast="auto">: Measures how well the model’s response addresses the user query.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="4" 
data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><b><span data-contrast="auto">Faithfulness</span></b><span data-contrast="auto">: Assesses how accurately the response aligns with the provided context, helping to limit hallucination in RAG systems.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><b><span data-contrast="auto">ContextualRecallMetric, ContextualPrecisionMetric, ContextualRelevancyMetric</span></b><span data-contrast="auto">: These metrics are particularly useful for RAG systems, evaluating whether retrieval components return all relevant context while avoiding irrelevant information.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul>						</div>
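<p>To make the retrieval metrics more concrete, the sketch below computes simplified versions of contextual precision and contextual recall. It assumes the relevance verdicts are already available as booleans; in DeepEval these verdicts are produced by a judge LLM, and the library's exact scoring may differ from the simplified formulas used here.</p>

```python
def contextual_precision(relevance: list[bool]) -> float:
    """Rank-weighted precision over retrieved items, given in retrieval order.

    Relevant items ranked near the top raise the score; relevant items
    buried below irrelevant ones lower it.
    """
    hits = 0
    weighted = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            weighted += hits / rank  # precision at this rank
    return weighted / hits if hits else 0.0


def contextual_recall(supported: list[bool]) -> float:
    """Share of expected-answer statements that the retrieved context supports."""
    return sum(supported) / len(supported) if supported else 0.0
```

<p>For example, retrieving three tables of which the first and third are relevant gives a precision of (1/1 + 2/3) / 2, roughly 0.83, whereas the same two relevant tables ranked first and second would score 1.0.</p>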
				</div>
				<div class="elementor-element elementor-element-1c250db elementor-widget elementor-widget-heading" data-id="1c250db" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Step-by-Step RAG Model Testing with DeepEval  </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-e3db2c9 elementor-widget elementor-widget-text-editor" data-id="e3db2c9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TrackedChange SCXW136457389 BCX0"><span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">To ensure the reliability and accuracy of our Retrieval-Augmented Generation (RAG) model, we follow a structured evaluation approach. </span></span></span><span class="TrackedChange SCXW136457389 BCX0"><span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">This process involves dataset creation, response generation, and model evaluation using </span></span></span><span class="TrackedChange SCXW136457389 BCX0"><span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">DeepEval</span></span></span><span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">, allowing us to systematically assess the effectiveness of both retrieval and generation components.</span></span> <span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">Let’s</span><span class="NormalTextRun SCXW136457389 BCX0"> break down each step in detail.</span></span><span class="EOP SCXW136457389 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-019ed5f elementor-widget elementor-widget-heading" data-id="019ed5f" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">1. Dataset Creation </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-741ed12 elementor-widget elementor-widget-text-editor" data-id="741ed12" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">To evaluate performance, we create a test set consisting of:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW40841927 BCX0"><span class="SCXW40841927 BCX0"> </span><br class="SCXW40841927 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW40841927 BCX0"><span class="SCXW40841927 BCX0"> </span><br class="SCXW40841927 BCX0" /></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">&#8211; </span></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">Realistic questions</span></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0"> that users might ask. These can range from simple fact-based queries to more complex, multi-step inquiries that require detailed answers drawn from multiple tables.</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW40841927 BCX0"><span class="SCXW40841927 BCX0"> </span><br class="SCXW40841927 BCX0" /></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">&#8211; </span></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">Expected ground truth responses</span></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0"> derived directly from the database.</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW40841927 BCX0"><span class="SCXW40841927 BCX0"> </span><br class="SCXW40841927 BCX0" /></span></p>						</div>
				</div>
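<p>A test set of this kind can be kept as plain records before it is handed to any evaluation library. The sketch below is a minimal illustration, not our production code: the <code>EvalRecord</code> class and the <code>validate</code> helper are hypothetical names, and the single entry reuses the question and ground truth quoted in this article.</p>

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EvalRecord:
    """One evaluation example: a user question and its ground-truth answer."""
    question: str
    expected_answer: str


def validate(records: List[EvalRecord]) -> None:
    """Reject empty questions or answers before any model is called."""
    for record in records:
        if not record.question.strip() or not record.expected_answer.strip():
            raise ValueError(f"Incomplete record: {record!r}")


# The single entry below reuses the question and ground truth quoted in this article.
test_set = [
    EvalRecord(
        question="Compare cargo traffic in Suez Canal and Panama Canal in 2019",
        expected_answer=(
            "In 2019, the Suez Canal handled 1,031 million tons of cargo, whereas the "
            "Panama Canal transported only 243 million tons. This indicates that the "
            "Suez Canal carried a substantially higher volume of cargo than the Panama "
            "Canal that year."
        ),
    ),
]
validate(test_set)
```

<p>Keeping questions and ground truth together in one immutable record makes it harder for the two to drift apart when the underlying database is updated.</p>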
				<div class="elementor-element elementor-element-f4f0b50 elementor-widget elementor-widget-heading" data-id="f4f0b50" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">2. Generating Model Responses </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-587939b elementor-widget elementor-widget-text-editor" data-id="587939b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW56801091 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW56801091 BCX0">For each test query, the assistant generates an answer based on the relevant data retrieved from the database.</span></span><span class="EOP SCXW56801091 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
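<p>In code, this step is a loop that sends each question to the assistant and stores the answer together with the retrieved data. The sketch below assumes the assistant can be called as a function returning an answer and its retrieval context; <code>collect_responses</code> and <code>fake_assistant</code> are hypothetical names, and the stand-in returns canned values rather than querying a real model.</p>

```python
from typing import Callable, Dict, List, Tuple

# Assumed interface: question in, (answer, retrieved rows as strings) out.
AssistantFn = Callable[[str], Tuple[str, List[str]]]


def collect_responses(questions: List[str], ask: AssistantFn) -> List[Dict[str, object]]:
    """Query the assistant once per question and keep everything needed for evaluation."""
    results = []
    for question in questions:
        answer, context = ask(question)
        results.append(
            {"input": question, "actual_output": answer, "retrieval_context": context}
        )
    return results


def fake_assistant(question: str) -> Tuple[str, List[str]]:
    """Stand-in for the real assistant; returns a canned answer and context."""
    return f"(answer to: {question})", ["(retrieved row)"]


responses = collect_responses(
    ["Compare cargo traffic in Suez Canal and Panama Canal in 2019"],
    fake_assistant,
)
```

<p>The dictionary keys mirror the test-case fields used in the evaluation step, so each stored response can be turned into a test case without renaming anything.</p>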
				<div class="elementor-element elementor-element-0deb934 elementor-widget elementor-widget-heading" data-id="0deb934" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">3. Evaluation using DeepEval </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-7c0b685 elementor-widget elementor-widget-text-editor" data-id="7c0b685" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">We are particularly focused on </span><b><span data-contrast="auto">factual correctness</span></b><span data-contrast="auto"> for our assistant, so we use the </span><b><span data-contrast="auto">G-Eval metric</span></b><span data-contrast="auto"> to evaluate this aspect.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">G-Eval is configured by describing the evaluation criteria as plain-language steps, for example:</span><span data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-3d98b09 elementor-widget elementor-widget-text-editor" data-id="3d98b09" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-contrast="auto">from deepeval.metrics import GEval</span> <br /><span data-contrast="auto">from deepeval.test_case import LLMTestCaseParams</span> <br /><br /><span data-contrast="auto"># Custom LLM-judged metric: compares the actual output with the expected output</span> <br /><span data-contrast="auto">correctness_metric = GEval(</span> <br /><span data-contrast="auto">    name="Correctness",</span> <br /><span data-contrast="auto">    evaluation_steps=[</span> <br /><span data-contrast="auto">        "Assess whether the actual output is accurate in terms of facts compared to the expected output.",</span> <br /><span data-contrast="auto">        "Penalize missing information."</span> <br /><span data-contrast="auto">    ],</span> <br /><span data-contrast="auto">    evaluation_params=[</span> <br /><span data-contrast="auto">        LLMTestCaseParams.INPUT,</span> <br /><span data-contrast="auto">        LLMTestCaseParams.ACTUAL_OUTPUT,</span> <br /><span data-contrast="auto">        LLMTestCaseParams.EXPECTED_OUTPUT</span> <br /><span data-contrast="auto">    ],</span> <br /><span data-contrast="auto">)</span> </pre>						</div>
				</div>
				<div class="elementor-element elementor-element-63d0764 elementor-widget elementor-widget-text-editor" data-id="63d0764" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW196212698 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW196212698 BCX0">Additionally, we use several built-in metrics:</span></span><span class="EOP SCXW196212698 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-4344a23 elementor-widget elementor-widget-text-editor" data-id="4344a23" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">from deepeval.metrics import (</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">    AnswerRelevancyMetric, ContextualPrecisionMetric, ContextualRecallMetric,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">    ContextualRelevancyMetric, FaithfulnessMetric,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">)</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">contextual_precision = ContextualPrecisionMetric()</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">contextual_recall = ContextualRecallMetric()</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">contextual_relevancy = ContextualRelevancyMetric()</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">answer_relevancy = AnswerRelevancyMetric()</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">faithfulness = FaithfulnessMetric()</span></span><span class="EOP SCXW8241585 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></pre>						</div>
				</div>
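<p>To give a feel for what a retrieval metric measures, the sketch below computes a rank-weighted precision over retrieved nodes. It assumes the weighted-precision formula given in the DeepEval documentation for Contextual Precision; in the library the relevance of each node is judged by an LLM, which this illustration replaces with hand-set booleans. The function name is our own and is not part of DeepEval.</p>

```python
from typing import List


def contextual_precision(relevance: List[bool]) -> float:
    """Rank-weighted precision over retrieved nodes, given in retrieval order.

    Each relevant node at position k contributes (relevant nodes up to k) / k,
    and the sum is divided by the total number of relevant nodes.
    """
    relevant_total = sum(relevance)
    if relevant_total == 0:
        return 0.0
    score = 0.0
    relevant_so_far = 0
    for k, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            relevant_so_far += 1
            score += relevant_so_far / k
    return score / relevant_total


# Three retrieved nodes, all relevant: the ranking cannot be improved.
all_relevant = contextual_precision([True, True, True])

# Same relevant nodes, but an irrelevant one is ranked first.
irrelevant_first = contextual_precision([False, True, True])
```

<p>Under this formula a retriever is rewarded for ranking relevant nodes first, not only for returning them: pushing one irrelevant node to the top lowers the score even though the same relevant nodes are still present.</p>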
				<div class="elementor-element elementor-element-5d2ddcf elementor-widget elementor-widget-text-editor" data-id="5d2ddcf" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW77075170 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW77075170 BCX0">We then define test cases:</span></span><span class="EOP SCXW77075170 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f16e61d elementor-widget elementor-widget-text-editor" data-id="f16e61d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-contrast="auto">from deepeval.test_case import LLMTestCase</span> <br /><br /><span data-contrast="auto">test_case = LLMTestCase(</span> <br /><span data-contrast="auto">    input="...",  # user prompt</span> <br /><span data-contrast="auto">    actual_output="...",  # model output</span> <br /><span data-contrast="auto">    expected_output="...",  # ground-truth response</span> <br /><span data-contrast="auto">    retrieval_context=["..."],  # data returned by the retriever; in our case, rows extracted from the database</span> <br /><span data-contrast="auto">)</span> </pre>						</div>
				</div>
				<div class="elementor-element elementor-element-466c61d elementor-widget elementor-widget-text-editor" data-id="466c61d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW131448305 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW131448305 BCX0">Here is one of the test cases we used to evaluate our SeaStat Assistant:</span></span><span class="EOP SCXW131448305 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-53c56bc elementor-widget elementor-widget-text-editor" data-id="53c56bc" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-contrast="auto">test_case = LLMTestCase(</span> <br /><span data-contrast="auto">    input='Compare cargo traffic in Suez Canal and Panama Canal in 2019',</span> <br /><span data-contrast="auto">    actual_output='In 2019, the cargo traffic data for the Suez Canal and Panama Canal was as follows: Suez Canal - 1031 million tons; Panama Canal - 243059 thousand tons. The Suez Canal had significantly higher cargo traffic compared to the Panama Canal in 2019.',</span> <br /><span data-contrast="auto">    expected_output='In 2019, the Suez Canal handled 1,031 million tons of cargo, whereas the Panama Canal transported only 243 million tons. This indicates that the Suez Canal carried a substantially higher volume of cargo than the Panama Canal that year.',</span> <br /><span data-contrast="auto">    retrieval_context=[</span> <br /><span data-contrast="auto">        {'table': 'Suez_Canal_Cargo_Traffic', 'year': 2019, 'cargo_volume_million_tons': 1031},</span> <br /><span data-contrast="auto">        {'table': 'Panama_Canal_Cargo_Traffic', 'year': 2019, 'direction': 'Atlantic – Pacific', 'cargo_volume_thousand_tons': 156899},</span> <br /><span data-contrast="auto">        {'table': 'Panama_Canal_Cargo_Traffic', 'year': 2019, 'direction': 'Pacific – Atlantic', 'cargo_volume_thousand_tons': 86160},</span> <br /><span data-contrast="auto">    ],</span> <br /><span data-contrast="auto">)</span> </pre>						</div>
				</div>
				<div class="elementor-element elementor-element-2c1ba07 elementor-widget elementor-widget-text-editor" data-id="2c1ba07" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW81219040 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW81219040 BCX0">Finally, we run the evaluation:</span></span><span class="EOP SCXW81219040 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559731&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-fb89c70 elementor-widget elementor-widget-text-editor" data-id="fb89c70" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-contrast="auto">from deepeval import assert_test</span> <br /><span data-contrast="auto">assert_test(test_case, [correctness_metric, answer_relevancy, contextual_precision, contextual_recall, contextual_relevancy, faithfulness])</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559731&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-9893049 elementor-widget elementor-widget-heading" data-id="9893049" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">4. Testing results </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-283d669 elementor-widget elementor-widget-text-editor" data-id="283d669" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">DeepEval assigns each metric a score between 0 and 1, accompanied by a descriptive explanation of the rating. Below are the results from a test case evaluating SeaStat&#8217;s response to the prompt:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><b><span data-contrast="auto">&#8220;Compare cargo traffic in the Suez Canal and Panama Canal in 2019.&#8221;</span></b><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">Metric interpretations:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li data-leveltext="" data-font="Symbol" data-listid="8" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">Contextual Recall</span></b> <b><span data-contrast="auto">(1.0)</span></b><span data-contrast="auto"> &#8211; The retriever returned the necessary information: every statement in the expected output could be traced to the retrieval context.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="8" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Contextual Relevancy (0.95)</span></b><span data-contrast="auto"> and </span><b><span data-contrast="auto">Contextual Precision (1.0)</span></b><span data-contrast="auto"> &#8211; The retrieved context was highly relevant to the query, and the relevant records were ranked at the top.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">Faithfulness</span></b> <b><span data-contrast="auto">(1.0)</span></b><span data-contrast="auto"> &#8211; The model’s response did not contradict the retrieved information and introduced no hallucinations.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Answer Relevancy</span></b> <b><span data-contrast="auto">(1.0)</span></b><span data-contrast="auto"> – The model&#8217;s response fully addressed the user query and contained no irrelevant details.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><b><span data-contrast="auto">Correctness</span></b> <b><span data-contrast="auto">(0.78)</span></b><span data-contrast="auto"> – The score was lower because the answer reports the Panama Canal volume in thousand tons (243059) while the expected output gives it in rounded million tons (243), which the judge flagged as a unit and rounding discrepancy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><p><span data-contrast="auto">By systematically analyzing test cases with DeepEval, we can see where our RAG model performs well and where it needs improvement. Future optimizations could include refining retrieval strategies, adjusting prompts, or tuning LLM parameters for better factual accuracy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
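<p>The numerical discrepancy behind the Correctness score can be checked by hand. The short calculation below uses only the figures from the retrieval context of the test case above: it sums the two Panama Canal rows and converts the total to million tons so that both canals are expressed in the same unit. The variable names are ours.</p>

```python
# Values copied from the retrieval context of the test case.
suez_million_tons = 1031
panama_rows_thousand_tons = [156899, 86160]  # Atlantic-Pacific, Pacific-Atlantic

# Sum both directions, then convert thousand tons to million tons.
panama_total_thousand_tons = sum(panama_rows_thousand_tons)
panama_million_tons = panama_total_thousand_tons / 1000

# How many times larger the Suez Canal volume is once units match.
ratio = suez_million_tons / panama_million_tons

print(f"Panama Canal total: {panama_total_thousand_tons} thousand tons")
print(f"Panama Canal total: {panama_million_tons:.3f} million tons")
print(f"Suez / Panama ratio: {ratio:.2f}")
```

<p>The total of 243059 thousand tons rounds to the 243 million tons given in the expected output, so the model&#8217;s answer is factually consistent with the ground truth; the two differ only in unit and rounding. Normalising units before the answer is generated is one possible way to avoid this penalty.</p>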
				<div class="elementor-element elementor-element-0445df6 elementor-widget elementor-widget-text-editor" data-id="0445df6" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<table style="font-weight: 400;" data-tablestyle="MsoTableGrid" data-tablelook="1696" aria-rowcount="7"><tbody><tr aria-rowindex="1"><td data-celllook="0"><p><b><span data-contrast="auto">Test case</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">Metric</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">Score</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">Status</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">Overall Success Rate</span></b><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="2"><td colspan="1" rowspan="6" data-celllook="0"><p><span data-contrast="auto">test_case_0</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">Correctness (GEval)</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">0.78 (threshold=0.5, evaluation model=gpt-4o, reason=The actual output closely matches the expected output in terms of cargo volumes and comparative conclusion, but the numbers are expressed in different units (thousand tons vs million tons) and slightly differ, which may indicate rounding or conversion discrepancies., error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td><td colspan="1" rowspan="6" data-celllook="0"><p><span 
data-contrast="auto">100%</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="3"><td data-celllook="0"><p><span data-contrast="auto">Answer Relevancy</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">1.0 (threshold=0.5, evaluation model=gpt-4o, reason=The score is 1.00 because the response thoroughly addressed the comparison of cargo traffic in the Suez Canal and the Panama Canal in 2019 with no irrelevant details included. It&#8217;s precise and to the point, showcasing a deep understanding of the topic., error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="4"><td data-celllook="0"><p><span data-contrast="auto">Contextual Precision</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">1.0 (threshold=0.5, evaluation model=gpt-4o, reason=The score is 1.00 because the relevant nodes, offering essential data for comparing cargo traffic in the Suez and Panama Canals in 2019, are perfectly ranked at the top. 
These nodes effectively deliver a comprehensive breakdown of cargo volumes through both canals during that year, ensuring accurate comparisons can be made efficiently., error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="5"><td data-celllook="0"><p><span data-contrast="auto">Contextual Recall</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">1.0 (threshold=0.5, evaluation model=gpt-4o, reason=The score is 1.00 because every sentence in the expected output aligns perfectly with the data from the nodes in the retrieval context, effectively illustrating the significant difference in cargo volumes handled by both canals. 
Well done on maintaining precise and accurate attention to detail!, error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="6"><td data-celllook="0"><p><span data-contrast="auto">Contextual Relevancy</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">0.95 (threshold=0.5, evaluation model=gpt-4o, reason=The score is 0.95 because although the context is rich with detailed data on Suez Canal cargo traffic, it lacks specific information on the Panama Canal&#8217;s cargo traffic, necessitating additional data for a complete comparison., error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="7"><td data-celllook="0"><p><span data-contrast="auto">Faithfulness</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">1.0 (threshold=0.5, evaluation model=gpt-4o, reason=Awesome job! The score is 1.00 because there are no contradictions present, showcasing perfect alignment and faithfulness of the actual output to the retrieval context. 
Keep up the excellent work!, error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr></tbody></table>						</div>
				</div>
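<p>The Status and Overall Success Rate columns follow from the scores and the threshold shown in the table. The sketch below reproduces them under two assumptions of ours: a metric passes when its score is at least the threshold, and the success rate is the share of metrics that passed. The scores and the threshold of 0.5 are taken from the table; the variable names are hypothetical.</p>

```python
# Scores and threshold as reported in the results table.
THRESHOLD = 0.5
scores = {
    "Correctness (GEval)": 0.78,
    "Answer Relevancy": 1.0,
    "Contextual Precision": 1.0,
    "Contextual Recall": 1.0,
    "Contextual Relevancy": 0.95,
    "Faithfulness": 1.0,
}

# Assumption: a metric passes when its score reaches the threshold.
statuses = {
    name: "PASSED" if score >= THRESHOLD else "FAILED"
    for name, score in scores.items()
}

# Assumption: success rate is the percentage of metrics that passed.
passed = sum(status == "PASSED" for status in statuses.values())
success_rate = 100 * passed / len(statuses)

for name, status in statuses.items():
    print(f"{name}: {scores[name]} -> {status}")
print(f"Overall success rate: {success_rate:.0f}%")
```

<p>With a threshold of 0.5, even the Correctness score of 0.78 passes comfortably. A stricter threshold for the metrics that matter most, such as factual correctness, would make the same report more demanding.</p>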
				<div class="elementor-element elementor-element-abdf550 elementor-widget elementor-widget-text-editor" data-id="abdf550" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Evaluating Retrieval-Augmented Generation (RAG) models requires a structured approach to ensure both retrieval accuracy and response reliability. </span><span data-contrast="auto">LLM-as-a-Judge</span> <span data-contrast="auto">provides</span><span data-contrast="auto"> an efficient alternative to human evaluation by systematically assessing outputs based on predefined criteria, enabling scalable and cost-effective validation.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">Using </span><span data-contrast="auto">DeepEval</span><span data-contrast="auto">, we tested our AI-driven </span><span data-contrast="auto">SeaStat</span><span data-contrast="auto"> Assistant</span><span data-contrast="auto"> against key evaluation metrics, including </span><span data-contrast="auto">Correctness (G-Eval), Answer Relevancy, Contextual Precision, Contextual Recall, Contextual Relevancy, and Faithfulness</span><span data-contrast="auto">. The results highlighted </span><span data-contrast="auto">minor discrepancies in numerical representation, missing contextual details, and retrieval precision—insights crucial f</span><span data-contrast="auto">o</span><span data-contrast="auto">r refining model performance.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">These findings emphasize that </span><span data-contrast="auto">even high-performing RAG models require rigorous evaluation to ensure factual accuracy and prevent misleading outputs</span><span data-contrast="auto">. 
By automating this process, we enable continuous model improvement, ensuring </span><span data-contrast="auto">AI-driven assistants deliver reliable, context-aware insights at scale</span><span data-contrast="auto">.</span> <span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">AI-powered assistants are likely to become an everyday tool for employees at all levels, from executives and directors to specialists. Because they are developing rapidly, they can be adapted quickly to changing business needs and expectations.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-308ac2d elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="308ac2d" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<div class="elementor-cta">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2024/12/3-1030x1030.png);" role="img" aria-label="3"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						We create reliable AI assistants					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						If you're looking for a company to help you implement an AI-based solution, reach out to us. We’d be happy to discuss your idea.					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<a class="elementor-cta__button elementor-button elementor-size-" href="https://inero-software.com/contact-us/">
						Contact Us					</a>
					</div>
							</div>
						</div>
				</div>
				</div>
				</div>
				</div>
		<div class="elementor-element elementor-element-961021e e-con-full e-flex e-con e-child" data-id="961021e" data-element_type="container">
				</div>
					</div>
				</div>
				</div>
		<p>The article <a href="https://inero-software.com/assessing-retrieval-augmented-generation-rag-large-language-models-llms-with-deepeval-for-complex-tabular-data/">Assessing Retrieval-Augmented Generation (RAG) Large Language Models (LLMs) with DeepEval for Complex Tabular Data</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">6902</post-id>	</item>
		<item>
		<title>Running AI in client-side: Real-Time Face Detection in the Browser Using YOLO and TensorFlow.js &#8211; use case study</title>
		<link>https://inero-software.com/running-ai-in-client-side-real-time-face-detection-in-the-browser-using-yolo-and-tensorflow-js-use-case-study/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Wed, 09 Oct 2024 12:32:00 +0000</pubDate>
				<category><![CDATA[Company]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[ML]]></category>
		<category><![CDATA[TensorFlow.js]]></category>
		<category><![CDATA[yolo]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=6202</guid>

					<description><![CDATA[<p>In this post, we'll explore how to implement object detection directly in the browser using YOLO </p>
<p>The article <a href="https://inero-software.com/running-ai-in-client-side-real-time-face-detection-in-the-browser-using-yolo-and-tensorflow-js-use-case-study/">Running AI in client-side: Real-Time Face Detection in the Browser Using YOLO and TensorFlow.js &#8211; use case study</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="6202" class="elementor elementor-6202" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-d225397 e-flex e-con-boxed e-con e-parent" data-id="d225397" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-4df78db elementor-widget elementor-widget-html" data-id="4df78db" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
					</div>
				</div>
					</div>
				</div>
		<div class="elementor-element elementor-element-0827b19 e-flex e-con-boxed e-con e-parent" data-id="0827b19" data-element_type="container">
					<div class="e-con-inner">
		<div class="elementor-element elementor-element-afa200d e-con-full e-flex e-con e-child" data-id="afa200d" data-element_type="container">
				</div>
		<div class="elementor-element elementor-element-3302212 e-con-full e-flex e-con e-child" data-id="3302212" data-element_type="container">
				<div class="elementor-element elementor-element-640ca1b elementor-widget elementor-widget-text-editor" data-id="640ca1b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW252636194 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"><span class="NormalTextRun SCXW252636194 BCX0">With the growing demand for real-time applications, running deep learning models in the browser has become more accessible and powerful. </span> </span><span class="TextRun SCXW252636194 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW252636194 BCX0">In this post, we&#8217;ll explore how to implement </span></span><span class="TextRun SCXW252636194 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW252636194 BCX0">object detection</span></span><span class="TextRun SCXW252636194 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW252636194 BCX0"> directly in the browser using </span></span><span class="TextRun SCXW252636194 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW252636194 BCX0">YOLO (You Only Look Once)</span></span><span class="TextRun SCXW252636194 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW252636194 BCX0"> and </span></span><span class="TextRun SCXW252636194 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW252636194 BCX0">TensorFlow.js</span></span><span class="TextRun SCXW252636194 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW252636194 BCX0">.</span></span><span class="TextRun SCXW252636194 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"> <span class="NormalTextRun SCXW252636194 BCX0">We will focus specifically on using a custom-trained YOLOv8 model for detecting human faces. 
By the end of this guide, </span><span class="NormalTextRun SCXW252636194 BCX0">you&#8217;ll</span><span class="NormalTextRun SCXW252636194 BCX0"> learn how to set up and run the YOLO model for face detection using the TensorFlow.js library, process the results, and </span><span class="NormalTextRun SCXW252636194 BCX0">optimize</span><span class="NormalTextRun SCXW252636194 BCX0"> its performance—all without needing a server or backend processing.</span></span><span class="EOP SCXW252636194 BCX0" data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></strong></p>						</div>
				</div>
				<div class="elementor-element elementor-element-aed25a2 elementor-widget elementor-widget-heading" data-id="aed25a2" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Why Run Neural Networks in the Browser? </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-344fad5 elementor-widget elementor-widget-text-editor" data-id="344fad5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Running neural networks in the browser offers several advantages:</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p><ol><li><b><span data-contrast="auto">Low Latency</span></b><span data-contrast="auto">: Everything happens client-side, avoiding the delay of sending data to a server and waiting for a response.</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></li><li><b><span data-contrast="auto">Enhanced Privacy</span></b><span data-contrast="auto">: Sensitive data remains on the user&#8217;s device, reducing the risk of breaches or exposure.</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></li><li><b><span data-contrast="auto">Offline Capabilities</span></b><span data-contrast="auto">: Users can access machine learning functionalities without a continuous internet connection.</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></li><li><b><span data-contrast="auto">Cross-Platform Compatibility</span></b><span data-contrast="auto">: Your application can run on any device with a browser—desktop, tablet, or smartphone.</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></li></ol>						</div>
				</div>
				<div class="elementor-element elementor-element-9e13d20 elementor-widget elementor-widget-heading" data-id="9e13d20" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Choosing and Preparing Your Neural Network </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-769829b elementor-widget elementor-widget-text-editor" data-id="769829b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">When selecting a neural network for browser implementation, consider factors like model size, speed, memory usage, and compatibility with browser technologies such as WebGL. For optimal performance on resource-limited hardware, it&#8217;s recommended to use models smaller than 30 MB. Suitable models include MobileNetV2, SqueezeNet, EfficientNet, and certain YOLO variants. In our case, we opted for a custom-trained YOLOv8 model for detecting human faces in images.</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p><p><span data-contrast="auto">If your model exceeds the recommended size, consider optimization techniques like quantization and pruning. </span><b><span data-contrast="auto">Quantization</span></b><span data-contrast="auto"> reduces the precision of the model&#8217;s weights, typically converting 32-bit floating-point values to 16-bit floating-point values or 8-bit integers. </span><b><span data-contrast="auto">Pruning</span></b><span data-contrast="auto"> removes redundant connections in the neural network. Both methods shrink the model and reduce computational complexity, enhancing inference speed, especially on devices like smartphones, though they may slightly affect accuracy.</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p>						</div>
				</div>
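				<p>The effect of quantization on storage can be illustrated with a minimal sketch. It is not the conversion tooling used for the model in this post; it only packs a handful of hypothetical weights as 32-bit floats, 16-bit floats, and 8-bit integers using Python's standard <code>struct</code> module. The helper names (<code>pack_weights</code>, <code>quantize_int8</code>, <code>max_error</code>) and the symmetric 8-bit scheme are assumptions made for this example.</p>

```python
import struct


def pack_weights(weights, code):
    """Serialize weights with a struct type code: 'f' is float32, 'e' is float16."""
    return struct.pack(f"={len(weights)}{code}", *weights)


def unpack_weights(blob, code):
    """Inverse of pack_weights for the same type code."""
    count = len(blob) // struct.calcsize(f"={code}")
    return list(struct.unpack(f"={count}{code}", blob))


def quantize_int8(weights):
    """Symmetric 8-bit quantization onto the integer range [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    return [round(w / scale) for w in weights], scale


def max_error(original, restored):
    """Largest absolute difference between original and restored weights."""
    return max(abs(a - b) for a, b in zip(original, restored))


# Hypothetical weights, for illustration only.
weights = [0.1234567, -0.9876543, 0.5, -0.0001234, 0.75]

fp32_blob = pack_weights(weights, "f")
fp16_blob = pack_weights(weights, "e")
codes, scale = quantize_int8(weights)
int8_blob = struct.pack(f"={len(codes)}b", *codes)

fp16_restored = unpack_weights(fp16_blob, "e")
int8_restored = [c * scale for c in codes]

print("float32 bytes:", len(fp32_blob))
print("float16 bytes:", len(fp16_blob), "max error:", max_error(weights, fp16_restored))
print("int8 bytes:", len(int8_blob), "max error:", max_error(weights, int8_restored))
```

				<p>Each step down in precision halves the storage per weight, and the printed maximum error shows the rounding cost that can translate into a small accuracy loss.</p>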
				<div class="elementor-element elementor-element-76a71b3 elementor-widget elementor-widget-heading" data-id="76a71b3" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Optimizing YOLOv8 for Face Detection: Results from Our Custom Model </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-9111e72 elementor-widget elementor-widget-text-editor" data-id="9111e72" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW165867707 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"><span class="NormalTextRun SCXW165867707 BCX0">Our model is a YOLOv8 model trained on a custom dataset to streamline document workflows. The goal was to automatically verify whether an attachment </span><span class="NormalTextRun SCXW165867707 BCX0">contains</span><span class="NormalTextRun SCXW165867707 BCX0"> a clear, front-facing human photo, with no obstructions like masks. This is crucial for processes like ID verification, where the visibility of the person&#8217;s face is essential. Our dataset consisted of 1,500 images, split into 1,200 for training and 300 for validation. This allowed the model to learn how to distinguish between acceptable and unacceptable photos, ensuring accuracy in real-world use cases. </span><span class="NormalTextRun SCXW165867707 BCX0">The following images </span><span class="NormalTextRun SCXW165867707 BCX0">demonstrate</span><span class="NormalTextRun SCXW165867707 BCX0"> how the network functions in practice.</span></span><span class="EOP SCXW165867707 BCX0" data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360,&quot;335572071&quot;:0,&quot;335572072&quot;:0,&quot;335572073&quot;:0,&quot;335572075&quot;:0,&quot;335572076&quot;:0,&quot;335572077&quot;:0,&quot;335572079&quot;:0,&quot;335572080&quot;:0,&quot;335572081&quot;:0,&quot;335572083&quot;:0,&quot;335572084&quot;:0,&quot;335572085&quot;:0,&quot;335572087&quot;:0,&quot;335572088&quot;:0,&quot;335572089&quot;:0,&quot;469789798&quot;:&quot;nil&quot;,&quot;469789802&quot;:&quot;nil&quot;,&quot;469789806&quot;:&quot;nil&quot;,&quot;469789810&quot;:&quot;nil&quot;,&quot;469789814&quot;:&quot;nil&quot;}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-fd798a3 elementor-widget elementor-widget-image" data-id="fd798a3" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="6206" data-permalink="https://inero-software.com/running-ai-in-client-side-real-time-face-detection-in-the-browser-using-yolo-and-tensorflow-js-use-case-study/9102024gr1/" data-orig-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr1.jpg" data-orig-size="934,258" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="9102024gr1" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr1-300x83.jpg" data-large-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr1.jpg" tabindex="0" role="button" width="934" height="258" src="https://inero-software.com/wp-content/uploads/2024/10/9102024gr1.jpg" class="attachment-large size-large wp-image-6206" alt="" srcset="https://inero-software.com/wp-content/uploads/2024/10/9102024gr1.jpg 934w, https://inero-software.com/wp-content/uploads/2024/10/9102024gr1-300x83.jpg 300w, https://inero-software.com/wp-content/uploads/2024/10/9102024gr1-768x212.jpg 768w" sizes="(max-width: 934px) 100vw, 934px" data-attachment-id="6206" data-permalink="https://inero-software.com/running-ai-in-client-side-real-time-face-detection-in-the-browser-using-yolo-and-tensorflow-js-use-case-study/9102024gr1/" data-orig-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr1.jpg" data-orig-size="934,258" data-comments-opened="0" 
data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="9102024gr1" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr1-300x83.jpg" data-large-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr1.jpg" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-34bbb24 elementor-widget elementor-widget-heading" data-id="34bbb24" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<p class="elementor-heading-title elementor-size-default">(source of images: https://www.kaggle.com/datasets/ashwingupta3012/human-faces, https://www.kaggle.com/datasets/andrewmvd/face-mask-detection) </p>		</div>
				</div>
				<div class="elementor-element elementor-element-12bfcb4 elementor-widget elementor-widget-text-editor" data-id="12bfcb4" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW245593581 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"><span class="NormalTextRun SCXW245593581 BCX0">Inference results on four examples: the two faces on the left were detected correctly, while the two on the right were not detected because they were partially covered.</span></span></p><p><span class="TextRun SCXW206579471 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"><span class="NormalTextRun SCXW206579471 BCX0">As a baseline for our </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW206579471 BCX0">model,</span><span class="NormalTextRun SCXW206579471 BCX0"> we </span><span class="NormalTextRun SCXW206579471 BCX0">selected</span><span class="NormalTextRun SCXW206579471 BCX0"> YOLOv8s (small)</span><span class="NormalTextRun SCXW206579471 BCX0">, which resulted in a model size of 44 MB and achieved</span></span> <span class="TextRun SCXW206579471 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"><span class="NormalTextRun SCXW206579471 BCX0">99.9% precision and 99.1% recall on our custom validation dataset.</span></span> <span class="TextRun SCXW206579471 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"><span class="NormalTextRun SCXW206579471 BCX0">To explore model optimizations, we also tested a smaller baseline model, YOLOv8n (nano), and examined the effects of model quantization. Training with the YOLOv8n baseline produced a model of just 12 MB, with </span><span class="NormalTextRun SCXW206579471 BCX0">nearly identical</span><span class="NormalTextRun SCXW206579471 BCX0"> accuracy metrics (99.7% precision and 99.1% recall).</span></span><span class="TextRun SCXW206579471 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"> <span class="NormalTextRun SCXW206579471 BCX0">Next, we quantized both models; the resulting model sizes and accuracy metrics are shown in the table below: </span></span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-859298c elementor-widget elementor-widget-text-editor" data-id="859298c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<table class=" aligncenter" style="font-weight: 400;" data-tablestyle="MsoNormalTable" data-tablelook="1568" aria-rowcount="4"><tbody><tr aria-rowindex="1"><td colspan="1" rowspan="2" data-celllook="69905"><p><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td colspan="3" data-celllook="69905"><p><b><span data-contrast="auto">Baseline model</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td colspan="3" data-celllook="69905"><p><b><span data-contrast="auto">16-bit quantized model</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td></tr><tr aria-rowindex="2"><td data-celllook="69905"><p><b><span data-contrast="auto">Size</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><b><span data-contrast="auto">Precision</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><b><span data-contrast="auto">Recall</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><b><span data-contrast="auto">Size</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><b><span data-contrast="auto">Precision</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><b><span 
data-contrast="auto">Recall</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td></tr><tr aria-rowindex="3"><td data-celllook="69905"><p><b><span data-contrast="auto">YOLOv8 small</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><b><span data-contrast="auto">44 MB</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">0.999</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">0.991</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">22 MB</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">0.997</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">0.991</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td></tr><tr aria-rowindex="4"><td data-celllook="69905"><p><b><span data-contrast="auto">YOLOv8 nano</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">12 
MB</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">0.997</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">0.991</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><b><span data-contrast="auto">6 MB</span></b><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">0.989</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td><td data-celllook="69905"><p><span data-contrast="auto">0.991</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p></td></tr></tbody></table>						</div>
				</div>
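				<p>The precision and recall columns in the table above follow the standard definitions based on true positives, false positives, and false negatives. The sketch below shows how the two values are computed. The counts are hypothetical and are not taken from the experiment described here, and the function names are chosen for this example only.</p>

```python
def precision(true_pos, false_pos):
    """Fraction of predicted positives that are truly positive."""
    predicted = true_pos + false_pos
    return true_pos / predicted if predicted else 0.0


def recall(true_pos, false_neg):
    """Fraction of actual positives that were correctly identified."""
    actual = true_pos + false_neg
    return true_pos / actual if actual else 0.0


# Hypothetical counts for illustration; they are not results from this post.
tp, fp, fn = 90, 10, 30

p = precision(tp, fp)
r = recall(tp, fn)
print(f"precision={p:.3f} recall={r:.3f}")
```

				<p>With these counts, 90 of the 100 predicted positives are correct and 90 of the 120 actual positives are found, so a model can score well on one metric while lagging on the other.</p>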
				<div class="elementor-element elementor-element-a7be2ba elementor-widget elementor-widget-text-editor" data-id="a7be2ba" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><b><span data-contrast="auto">Note</span></b><span data-contrast="auto">: Recall measures the fraction of actual positive samples that were correctly identified, while precision indicates the fraction of predicted positives that were truly positive. In the ideal case, both equal 1.</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p><p><span data-contrast="auto">In our example, using a smaller baseline model with quantization reduced precision by about one percentage point (from 0.999 to 0.989) and left recall unchanged at 0.991, while shrinking the model size from 44 MB to just 6 MB.</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p><p data-ccp-border-bottom="0px none #000000" data-ccp-padding-bottom="0px" data-ccp-border-between="0px none #000000" data-ccp-padding-between="0px"><span data-contrast="auto">The example photos below illustrate how the two networks, YOLOv8s and quantized YOLOv8n, perform:</span><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360,&quot;335572071&quot;:0,&quot;335572072&quot;:0,&quot;335572073&quot;:0,&quot;335572075&quot;:0,&quot;335572076&quot;:0,&quot;335572077&quot;:0,&quot;335572079&quot;:0,&quot;335572080&quot;:0,&quot;335572081&quot;:0,&quot;335572083&quot;:0,&quot;335572084&quot;:0,&quot;335572085&quot;:0,&quot;335572087&quot;:0,&quot;335572088&quot;:0,&quot;335572089&quot;:0,&quot;469789798&quot;:&quot;nil&quot;,&quot;469789802&quot;:&quot;nil&quot;,&quot;469789806&quot;:&quot;nil&quot;,&quot;469789810&quot;:&quot;nil&quot;,&quot;469789814&quot;:&quot;nil&quot;}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-ba300a9 elementor-widget elementor-widget-image" data-id="ba300a9" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="6208" data-permalink="https://inero-software.com/running-ai-in-client-side-real-time-face-detection-in-the-browser-using-yolo-and-tensorflow-js-use-case-study/9102024gr2/" data-orig-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr2.jpg" data-orig-size="934,307" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="9102024gr2" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr2-300x99.jpg" data-large-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr2.jpg" tabindex="0" role="button" width="934" height="307" src="https://inero-software.com/wp-content/uploads/2024/10/9102024gr2.jpg" class="attachment-large size-large wp-image-6208" alt="" srcset="https://inero-software.com/wp-content/uploads/2024/10/9102024gr2.jpg 934w, https://inero-software.com/wp-content/uploads/2024/10/9102024gr2-300x99.jpg 300w, https://inero-software.com/wp-content/uploads/2024/10/9102024gr2-768x252.jpg 768w, https://inero-software.com/wp-content/uploads/2024/10/9102024gr2-913x300.jpg 913w" sizes="(max-width: 934px) 100vw, 934px" data-attachment-id="6208" data-permalink="https://inero-software.com/running-ai-in-client-side-real-time-face-detection-in-the-browser-using-yolo-and-tensorflow-js-use-case-study/9102024gr2/" data-orig-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr2.jpg" data-orig-size="934,307" data-comments-opened="0" 
data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="9102024gr2" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr2-300x99.jpg" data-large-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr2.jpg" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-3d14285 elementor-widget elementor-widget-text-editor" data-id="3d14285" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW46079478 BCX0" lang="EN-US" xml:lang="EN-US" data-contrast="auto"><span class="NormalTextRun SCXW46079478 BCX0">Inference results for the YOLOv8s model without quantization (44 MB)</span><span class="NormalTextRun SCXW46079478 BCX0">. (Source of images: </span><span class="NormalTextRun SCXW46079478 BCX0">https://www.kaggle.com/datasets/ashwingupta3012/human-faces</span><span class="NormalTextRun SCXW46079478 BCX0">).</span></span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-e4f1955 elementor-widget elementor-widget-image" data-id="e4f1955" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="6207" data-permalink="https://inero-software.com/running-ai-in-client-side-real-time-face-detection-in-the-browser-using-yolo-and-tensorflow-js-use-case-study/9102024gr3/" data-orig-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr3.jpg" data-orig-size="934,313" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="9102024gr3" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr3-300x101.jpg" data-large-file="https://inero-software.com/wp-content/uploads/2024/10/9102024gr3.jpg" tabindex="0" role="button" width="934" height="313" src="https://inero-software.com/wp-content/uploads/2024/10/9102024gr3.jpg" class="attachment-large size-large wp-image-6207" alt="" srcset="https://inero-software.com/wp-content/uploads/2024/10/9102024gr3.jpg 934w, https://inero-software.com/wp-content/uploads/2024/10/9102024gr3-300x101.jpg 300w, https://inero-software.com/wp-content/uploads/2024/10/9102024gr3-768x257.jpg 768w, https://inero-software.com/wp-content/uploads/2024/10/9102024gr3-895x300.jpg 895w" sizes="(max-width: 934px) 100vw, 934px" />													</div>
				</div>
				<div class="elementor-element elementor-element-bba7a47 elementor-widget elementor-widget-text-editor" data-id="bba7a47" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<table style="font-weight: 400;" data-tablestyle="MsoTableGrid" data-tablelook="1696" aria-rowcount="5"><tbody><tr aria-rowindex="1"><td colspan="1" rowspan="2" data-celllook="0"><p><span data-ccp-props="{}"> </span></p></td><td colspan="3" data-celllook="0"><p><b><span data-contrast="auto">Loading model</span></b><span data-ccp-props="{}"> </span></p></td><td colspan="3" data-celllook="0"><p><b><span data-contrast="auto">Single inference</span></b><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="2"><td data-celllook="0"><p><b><span data-contrast="auto">CPU 1</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">CPU 2</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">CPU 3</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">CPU 1</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">CPU 2</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">CPU 3</span></b><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="3"><td data-celllook="0"><p><b><span data-contrast="auto">YOLOv8 small</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">1050 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">3700 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">4200 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">21 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">117.5 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">196.5 ms</span><span data-ccp-props="{}"> 
</span></p></td></tr><tr aria-rowindex="4"><td data-celllook="0"><p><b><span data-contrast="auto">YOLOv8 nano 16-bit</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">980 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">3200 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">3700 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">16 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">112.5 ms</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">189 ms</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="5"><td data-celllook="0"><p><b><span data-contrast="auto">Time improvement</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">6.7 %</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">13.5 %</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">11.9 %</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">23.8 %</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">4.2 %</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">3.8 %</span><span data-ccp-props="{}"> </span></p></td></tr></tbody></table>						</div>
				</div>
				<div class="elementor-element elementor-element-8185505 elementor-widget elementor-widget-text-editor" data-id="8185505" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Inference results for the YOLOv8n model with 16-bit quantization (6 MB). The confidence levels differ only slightly, while the bounding boxes are in the same locations.</p><p>We tested two models, YOLOv8 small (44 MB) and 16-bit quantized YOLOv8 nano (6 MB), on three different CPUs. The smaller model, YOLOv8 nano, was faster than its larger counterpart in every measurement, with time improvements between 3.8 % and 23.8 % for loading and single inference. Detailed performance metrics, including CPU-based loading and inference times, are summarized in the table above.</p>						</div>
				</div>
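				<div class="elementor-element elementor-widget elementor-widget-text-editor" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The &#8220;Time improvement&#8221; row can be read as the relative reduction in time between the two models. The snippet below is a minimal sketch of that arithmetic, using the CPU 1 figures from the table; the function name <span style="color: #339966;">timeImprovement</span> and its parameter names are ours, chosen only for illustration.</p>

```javascript
// Relative time saved by the faster model, in percent, rounded to one decimal.
// `baselineMs` and `optimizedMs` are illustrative parameter names.
function timeImprovement(baselineMs, optimizedMs) {
  const saved = (baselineMs - optimizedMs) / baselineMs;
  return Math.round(saved * 1000) / 10;
}

// CPU 1 figures from the table: YOLOv8 small vs. YOLOv8 nano 16-bit
const loading = timeImprovement(1050, 980); // 6.7
const inference = timeImprovement(21, 16);  // 23.8
console.log(`loading: ${loading} %, inference: ${inference} %`);
```

						</div>
				</div>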
				<div class="elementor-element elementor-element-cc12989 elementor-widget elementor-widget-text-editor" data-id="cc12989" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Besides the CPU-based loading and inference times, another key factor is the time required to download the models, which is not included in the table. Download time grows with model size and depends heavily on the user&#8217;s network speed, so the 6 MB model has a clear advantage here over the 44 MB one.</p>						</div>
				</div>
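				<div class="elementor-element elementor-widget elementor-widget-text-editor" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>As a rough illustration of how model size translates into download time, the sketch below estimates the transfer time for both models. The 20 Mbit/s bandwidth is an assumed value, not a measurement, and the estimate ignores latency, compression and caching.</p>

```javascript
// Estimated transfer time in seconds for a file of `sizeMB` megabytes
// over a link of `bandwidthMbps` megabits per second (idealized).
function estimateDownloadSeconds(sizeMB, bandwidthMbps) {
  const sizeMegabits = sizeMB * 8;
  return sizeMegabits / bandwidthMbps;
}

const ASSUMED_BANDWIDTH_MBPS = 20; // hypothetical connection speed

const small = estimateDownloadSeconds(44, ASSUMED_BANDWIDTH_MBPS); // YOLOv8 small
const nano = estimateDownloadSeconds(6, ASSUMED_BANDWIDTH_MBPS);   // YOLOv8 nano 16-bit
console.log(`small: ${small.toFixed(1)} s, nano: ${nano.toFixed(1)} s`);
// small: 17.6 s, nano: 2.4 s
```

							<p>On slower mobile connections the gap widens in absolute terms, which is one more reason to prefer the smaller model for client-side use.</p>
						</div>
				</div>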
				<div class="elementor-element elementor-element-d6f4734 elementor-widget elementor-widget-heading" data-id="d6f4734" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Getting Started: Setting up TensorFlow.js </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-e00fb33 elementor-widget elementor-widget-text-editor" data-id="e00fb33" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>To deploy your machine learning model in the browser, we&#8217;ll use TensorFlow.js, a library that lets you run pre-trained models or train new ones entirely in the browser. This guide focuses on deploying a pre-trained YOLOv8 model for face detection. The steps below walk through setting up TensorFlow.js and getting it running.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d18e614 elementor-widget elementor-widget-heading" data-id="d18e614" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">1. Install TensorFlow.js </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-cb9241f elementor-widget elementor-widget-text-editor" data-id="cb9241f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>If you&#8217;re using a JavaScript bundler such as Webpack or Parcel, you can install TensorFlow.js via npm:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-c5deef0 elementor-widget elementor-widget-text-editor" data-id="c5deef0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>npm install @tensorflow/tfjs </pre>						</div>
				</div>
				<div class="elementor-element elementor-element-cd338d1 elementor-widget elementor-widget-heading" data-id="cd338d1" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">2. Load the Model </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-d54d572 elementor-widget elementor-widget-text-editor" data-id="d54d572" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Since we&#8217;re using the TensorFlow.js library, you need to convert your model to the TensorFlow.js (TF.js) format. For YOLO models, Ultralytics provides a single command for this:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-8acf4f8 elementor-widget elementor-widget-text-editor" data-id="8acf4f8" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>yolo export model=path/to/best.pt format=tfjs</pre>						</div>
				</div>
				<div class="elementor-element elementor-element-a9c3625 elementor-widget elementor-widget-text-editor" data-id="a9c3625" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Once converted, the model is saved as a set of binary weight files together with a JSON file called <span style="color: #339966;">model.json</span>. These files can be loaded into your application using the <span style="color: #339966;">tf.loadGraphModel()</span> function. Here&#8217;s an example of how to load the model and warm it up with a dummy input, so that the first real inference is not slowed down by one-time initialization:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-8ae265d elementor-widget elementor-widget-text-editor" data-id="8ae265d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>import * as tf from '@tensorflow/tfjs';

export async function loadModel(modelPath) {
  try {
    // Load the converted graph model from its model.json URL
    const model = await tf.loadGraphModel(`${modelPath}/model.json`);
    // Warm up the model with a dummy input of the expected shape
    const dummyInput = tf.ones(model.inputs[0].shape);
    const warmupResult = model.execute(dummyInput);
    // Release the warm-up tensors so they do not leak memory
    tf.dispose([warmupResult, dummyInput]);
    return model;
  } catch (error) {
    throw new Error(`Failed to load model: ${error.message}`);
  }
}</pre>						</div>
				</div>
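				<div class="elementor-element elementor-widget elementor-widget-text-editor" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>One detail worth checking in the warm-up step: depending on how a model was exported, the declared input shape may contain an undefined dimension (for example <span style="color: #339966;">-1</span> or <span style="color: #339966;">null</span>, typically for the batch size), which cannot be used directly to create a tensor. The helper below is a hypothetical addition, not part of the code above, that substitutes 1 for any such dimension. The example shape <span style="color: #339966;">[1, 640, 640, 3]</span> is assumed for illustration only.</p>

```javascript
// Replace undefined dimensions (null, undefined or non-positive values) with 1
// so the shape can be used to build a dummy warm-up tensor.
function resolveShape(shape) {
  return shape.map((dim) => (Number.isInteger(dim) && dim > 0 ? dim : 1));
}

console.log(resolveShape([-1, 640, 640, 3]));   // [ 1, 640, 640, 3 ]
console.log(resolveShape([null, 640, 640, 3])); // [ 1, 640, 640, 3 ]
console.log(resolveShape([1, 640, 640, 3]));    // [ 1, 640, 640, 3 ]
```

							<p>Inside <span style="color: #339966;">loadModel</span>, it could be applied as <span style="color: #339966;">tf.ones(resolveShape(model.inputs[0].shape))</span>.</p>
						</div>
				</div>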
				<div class="elementor-element elementor-element-4fb7ba0 elementor-widget elementor-widget-heading" data-id="4fb7ba0" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">3. Prepare the Input </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-bcd536f elementor-widget elementor-widget-text-editor" data-id="bcd536f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Before running the model, we need to preprocess the input image. YOLO models expect a fixed input size, so every image has to be brought to that size first. Rather than simply stretching the image to fit, which distorts it, we recommend scaling it while preserving the aspect ratio and filling the remaining area with letterbox padding. This is consistent with the preprocessing Ultralytics uses when training YOLO models.</p><p>The function below resizes, pads, and normalizes the input image to match the model&#8217;s required input size:</p>						</div>
				</div>
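				<div class="elementor-element elementor-widget elementor-widget-text-editor" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Before the full preprocessing function, it can help to look at the letterbox geometry on its own. The sketch here touches no pixels; it only works out how far the image is scaled and how much padding is needed to reach a square input. The function name <span style="color: #339966;">letterboxGeometry</span> and the 1280×720 example frame are assumptions for illustration, and how the padding is distributed (centered or on one side) is left to the implementation.</p>

```javascript
// Compute the scaled size and total padding for letterboxing an image
// of width x height into a square imgSize x imgSize input.
function letterboxGeometry(width, height, imgSize) {
  // The larger ratio decides the scale, so the longer side fits exactly
  const factor = Math.max(width / imgSize, height / imgSize);
  const newWidth = Math.round(width / factor);
  const newHeight = Math.round(height / factor);
  return {
    factor,
    newWidth,
    newHeight,
    padX: imgSize - newWidth,  // total horizontal padding in pixels
    padY: imgSize - newHeight, // total vertical padding in pixels
  };
}

// A 1280x720 frame letterboxed to 640x640
console.log(letterboxGeometry(1280, 720, 640));
// { factor: 2, newWidth: 640, newHeight: 360, padX: 0, padY: 280 }
```

							<p>The same <span style="color: #339966;">factor</span> is what later maps detected bounding boxes from model coordinates back to the original image.</p>
						</div>
				</div>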
				<div class="elementor-element elementor-element-dd83945 elementor-widget elementor-widget-text-editor" data-id="dd83945" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW147374849 BCX0">function</span></span> <span class="TextRun Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW147374849 BCX0">preprocessImage</span></span><span class="TextRun Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW147374849 BCX0">(base64Image, imgSize) {</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW147374849 BCX0"><span class="SCXW147374849 BCX0"> </span><br class="SCXW147374849 BCX0" /></span><span class="TextRun Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW147374849 BCX0">  </span></span><span class="TextRun Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW147374849 BCX0">const</span></span><span class="TextRun Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW147374849 BCX0"> image = </span></span><span class="TextRun Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW147374849 BCX0">new</span></span><span class="TextRun Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW147374849 BCX0"> Image();</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW147374849 BCX0"><span class="SCXW147374849 BCX0"> </span><br class="SCXW147374849 BCX0" /></span><span class="TextRun Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW147374849 BCX0">  image.src = base64Image;</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW147374849 BCX0"><span class="SCXW147374849 BCX0"> </span><br class="SCXW147374849 BCX0" /></span><span class="TextRun 
Highlight SCXW147374849 BCX0" lang="PL" xml:lang="PL" data-contrast="none">  const canvas = document.createElement('canvas');
  canvas.width = image.width;
  canvas.height = image.height;
  const ctx = canvas.getContext('2d');
  ctx.drawImage(image, 0, 0, image.width, image.height);

  // Convert the canvas image to a tensor
  let imgTensor = tf.browser.fromPixels(canvas);

  // Determine the rescale factor from the longer side
  const xFactor = image.width / imgSize;
  const yFactor = image.height / imgSize;
  const factor = Math.max(xFactor, yFactor);
  const newWidth = Math.round(image.width / factor);
  const newHeight = Math.round(image.height / factor);

  // Resize so the longer side equals imgSize, keeping the aspect ratio
  imgTensor = tf.image.resizeBilinear(imgTensor, [newHeight, newWidth]);

  // Pad the shorter side symmetrically with grey (114) up to imgSize x imgSize
  const xPad = (imgSize - newWidth) / 2;
  const yPad = (imgSize - newHeight) / 2;
  const top = Math.floor(yPad);
  const bottom = Math.ceil(yPad);
  const left = Math.floor(xPad);
  const right = Math.ceil(xPad);
  imgTensor = tf.pad(imgTensor, [[top, bottom], [left, right], [0, 0]], 114);

  // Scale pixel values to [0, 1] and add a batch dimension
  imgTensor = imgTensor.div(255.0).expandDims(0);
  return { imgTensor, left, top, factor };

}</span></pre>						</div>
				</div>
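<p>To make the resize-and-pad arithmetic above concrete, here is a minimal sketch in plain JavaScript with no TensorFlow.js dependency. The helper names letterboxParams() and toOriginalCoords() are hypothetical and not part of the original code, and the mapping back to original image coordinates is an assumption about how the returned left, top and factor values are meant to be used.</p>

```javascript
// Hypothetical helper: repeats only the arithmetic of the preprocessing step.
function letterboxParams(width, height, imgSize) {
  // The longer side decides the scale, so the whole image fits into imgSize x imgSize
  const factor = Math.max(width / imgSize, height / imgSize);
  const newWidth = Math.round(width / factor);
  const newHeight = Math.round(height / factor);
  // Remaining space is split between both sides of the shorter dimension
  const xPad = (imgSize - newWidth) / 2;
  const yPad = (imgSize - newHeight) / 2;
  return {
    factor,
    newWidth,
    newHeight,
    top: Math.floor(yPad),
    bottom: Math.ceil(yPad),
    left: Math.floor(xPad),
    right: Math.ceil(xPad),
  };
}

// Assumed inverse mapping: remove the padding offset, then undo the scaling.
function toOriginalCoords(x, y, params) {
  return {
    x: (x - params.left) * params.factor,
    y: (y - params.top) * params.factor,
  };
}

const params = letterboxParams(1280, 720, 640);
console.log(params.newWidth, params.newHeight, params.top, params.left); // 640 360 140 0
console.log(toOriginalCoords(320, 320, params)); // { x: 640, y: 360 }
```

<p>For a 1280 x 720 image and a 640 x 640 model input, the image is halved to 640 x 360 and receives 140 rows of padding above and below, so the centre of the model input maps back to the centre of the original image.</p>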
				<div class="elementor-element elementor-element-183b6cf elementor-widget elementor-widget-heading" data-id="183b6cf" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">4. Run Inference </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-639b754 elementor-widget elementor-widget-text-editor" data-id="639b754" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>With the model loaded and the input preprocessed, we can now run inference to detect faces in the image:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-25f3c6b elementor-widget elementor-widget-text-editor" data-id="25f3c6b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>const prediction = await model.execute(inputTensor);</pre>						</div>
				</div>
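<p>TensorFlow.js does not free tensor memory automatically, so the raw prediction should be disposed once it has been decoded. The sketch below shows one possible pattern for that. runInference() is a hypothetical wrapper that is not part of the original code, the stub model exists only so the snippet runs without TensorFlow.js, and the output shape used in the stub is purely illustrative.</p>

```javascript
// Hypothetical wrapper: runs the model, passes the raw output to `consume`,
// and always releases the output tensor afterwards, even if `consume` throws.
async function runInference(model, inputTensor, consume) {
  const prediction = await model.execute(inputTensor);
  try {
    return await consume(prediction);
  } finally {
    prediction.dispose();
  }
}

// Stand-in objects (assumptions) so the sketch runs without TensorFlow.js
const fakeOutput = {
  shape: [1, 5, 8400], // illustrative shape only
  disposed: false,
  dispose() {
    this.disposed = true;
  },
};
const fakeModel = { execute: () => fakeOutput };

runInference(fakeModel, {}, (output) => output.shape).then((shape) => {
  console.log(shape, fakeOutput.disposed);
});
```

<p>With a real model, the consume callback would be the place to call the postprocessing step, so that nothing keeps a reference to the prediction after it has been disposed.</p>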
				<div class="elementor-element elementor-element-0830169 elementor-widget elementor-widget-heading" data-id="0830169" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">5. Postprocess the Model Output </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-aeca953 elementor-widget elementor-widget-text-editor" data-id="aeca953" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The YOLO network returns a raw tensor that has to be decoded before it is useful. The steps below, taken from our <span style="color: #008000;">postprocessInferenceResults()</span> function, extract the bounding-box coordinates, classes, and confidence scores of all candidate detections:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-04c9010 elementor-widget elementor-widget-text-editor" data-id="04c9010" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre data-ccp-border-bottom="0px none #000000" data-ccp-padding-bottom="0px" data-ccp-border-between="0px none #000000" data-ccp-padding-between="0px">// Swap the last two axes so that each row describes one candidate box
const results = prediction.transpose([0, 2, 1]);
const numClass = 1; // Only one class in our case
const boxes = tf.tidy(() =&gt; {
  const w = results.slice([0, 0, 2], [-1, -1, 1]); // Box width
  const h = results.slice([0, 0, 3], [-1, -1, 1]); // Box height
  const x1 = tf.sub(results.slice([0, 0, 0], [-1, -1, 1]), tf.div(w, 2)); // Left edge: centre x minus half the width
  const y1 = tf.sub(results.slice([0, 0, 1], [-1, -1, 1]), tf.div(h, 2)); // Top edge: centre y minus half the height
  // Concatenate as [y1, x1, y2, x2] and drop the batch dimension
  return tf.concat([y1, x1, y1.add(h), x1.add(w)], 2).squeeze();
});</pre>						</div>
				</div>
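<p>The tensor operations above convert each box from centre format (centre x, centre y, width, height) to corner format. As a rough illustration, the same conversion for a single box can be written with plain arrays. centreToCorners() is a hypothetical name used only for this sketch. The [y1, x1, y2, x2] order is the one that the non-max suppression functions built into TensorFlow.js take as input.</p>

```javascript
// Hypothetical plain-array version of the box decoding for one candidate box.
function centreToCorners([cx, cy, w, h]) {
  const x1 = cx - w / 2; // left edge
  const y1 = cy - h / 2; // top edge
  // Same order as the tensor version: [y1, x1, y2, x2]
  return [y1, x1, y1 + h, x1 + w];
}

console.log(centreToCorners([320, 240, 100, 50])); // [ 215, 270, 265, 370 ]
```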
				<div class="elementor-element elementor-element-70bdd06 elementor-widget elementor-widget-text-editor" data-id="70bdd06" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>To extract the classes and confidence scores:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-4e859a7 elementor-widget elementor-widget-text-editor" data-id="4e859a7" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>// numClass is the value declared above (1 in our case)
const [scores, classes] = tf.tidy(() =&gt; {
  // Class scores start at channel 4, right after the four box values
  const rawData = results.slice([0, 0, 4], [-1, -1, numClass]).squeeze(0);
  // Best score and its class index for every candidate box
  return [rawData.max(1), rawData.argMax(1)];
});</pre>						</div>
				</div>
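<p>To make the <code>max(1)</code> and <code>argMax(1)</code> step more concrete, here is a minimal plain-JavaScript sketch of the same idea without TensorFlow.js. It assumes that each row holds the per-class scores of one candidate detection; <code>rowMaxAndArgMax</code> is a hypothetical helper name used only for illustration, not part of the original code.</p>

```javascript
// Hypothetical helper (illustration only): for every detection row,
// return the highest class score and the index of that class.
function rowMaxAndArgMax(rows) {
  const scores = [];
  const classes = [];
  for (const row of rows) {
    let bestIndex = 0;
    for (let i = 1; i < row.length; i++) {
      if (row[i] > row[bestIndex]) bestIndex = i;
    }
    scores.push(row[bestIndex]); // same role as rawData.max(1)
    classes.push(bestIndex);     // same role as rawData.argMax(1)
  }
  return [scores, classes];
}

// Two candidate detections, three classes each (made-up numbers).
const [exampleScores, exampleClasses] = rowMaxAndArgMax([
  [0.1, 0.7, 0.2],
  [0.05, 0.15, 0.6],
]);
console.log(exampleScores, exampleClasses);
```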
				<div class="elementor-element elementor-element-c29f2bf elementor-widget elementor-widget-text-editor" data-id="c29f2bf" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW52052999 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW52052999 BCX0">Next, keep only the detections whose confidence score exceeds a threshold (0.4 here):</span></span><span class="EOP SCXW52052999 BCX0" data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-2bbe314 elementor-widget elementor-widget-text-editor" data-id="2bbe314" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">const</span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0"> array = </span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">await</span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0"> scores.array();</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW227521811 BCX0"><span class="SCXW227521811 BCX0"> </span><br class="SCXW227521811 BCX0" /></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">const</span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0"> highConfidenceIndices = array.reduce((acc, value, index) =&gt; {</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW227521811 BCX0"><span class="SCXW227521811 BCX0"> </span><br class="SCXW227521811 BCX0" /></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">  </span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">if</span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0"> (value &gt; </span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 
BCX0">0.4</span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">) acc.push(index);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW227521811 BCX0"><span class="SCXW227521811 BCX0"> </span><br class="SCXW227521811 BCX0" /></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">  </span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">return</span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0"> acc;</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW227521811 BCX0"><span class="SCXW227521811 BCX0"> </span><br class="SCXW227521811 BCX0" /></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">}, []);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW227521811 BCX0"><span class="SCXW227521811 BCX0"> </span><br class="SCXW227521811 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW227521811 BCX0"><span class="SCXW227521811 BCX0"> </span><br class="SCXW227521811 BCX0" /></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">const</span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0"> highConfidenceBoxes = boxes.gather(highConfidenceIndices);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW227521811 BCX0"><span class="SCXW227521811 BCX0"> </span><br class="SCXW227521811 BCX0" /></span><span class="TextRun Highlight 
SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">const</span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0"> highConfidenceScores = scores.gather(highConfidenceIndices);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW227521811 BCX0"><span class="SCXW227521811 BCX0"> </span><br class="SCXW227521811 BCX0" /></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0">const</span></span><span class="TextRun Highlight SCXW227521811 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW227521811 BCX0"> highConfidenceClasses = classes.gather(highConfidenceIndices);</span></span><span class="EOP SCXW227521811 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335559740&quot;:360,&quot;335572071&quot;:0,&quot;335572072&quot;:0,&quot;335572073&quot;:0,&quot;335572075&quot;:0,&quot;335572076&quot;:0,&quot;335572077&quot;:0,&quot;335572079&quot;:0,&quot;335572080&quot;:0,&quot;335572081&quot;:0,&quot;335572083&quot;:0,&quot;335572084&quot;:0,&quot;335572085&quot;:0,&quot;335572087&quot;:0,&quot;335572088&quot;:0,&quot;335572089&quot;:0,&quot;469789798&quot;:&quot;nil&quot;,&quot;469789802&quot;:&quot;nil&quot;,&quot;469789806&quot;:&quot;nil&quot;,&quot;469789810&quot;:&quot;nil&quot;,&quot;469789814&quot;:&quot;nil&quot;}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-6d455b0 elementor-widget elementor-widget-text-editor" data-id="6d455b0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW186806948 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW186806948 BCX0">Then apply Non-Maximum Suppression (NMS) to discard bounding boxes that overlap a higher-scoring box:</span></span><span class="EOP SCXW186806948 BCX0" data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-a51ed9c elementor-widget elementor-widget-text-editor" data-id="a51ed9c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">const</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0"> nms = </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">await</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0"> tf.image.nonMaxSuppressionAsync(highConfidenceBoxes, highConfidenceScores, </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">40</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">, </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">0.45</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">, </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">0.4</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">); </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">// NMS to filter boxes</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW150352841 BCX0"><span class="SCXW150352841 BCX0"> </span><br 
class="SCXW150352841 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW150352841 BCX0"><span class="SCXW150352841 BCX0"> </span><br class="SCXW150352841 BCX0" /></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">const</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0"> boxesData = highConfidenceBoxes.gather(nms, </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">0</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">); </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">// Indexing boxes by NMS index</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW150352841 BCX0"><span class="SCXW150352841 BCX0"> </span><br class="SCXW150352841 BCX0" /></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">const</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0"> scoresData = highConfidenceScores.gather(nms, </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">0</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">).dataSync(); </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span 
class="NormalTextRun SCXW150352841 BCX0">// Indexing scores by NMS index</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW150352841 BCX0"><span class="SCXW150352841 BCX0"> </span><br class="SCXW150352841 BCX0" /></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">const</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0"> classesData = highConfidenceClasses.gather(nms, </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">0</span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">).dataSync(); </span></span><span class="TextRun Highlight SCXW150352841 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW150352841 BCX0">// Indexing classes by NMS index</span></span><span class="EOP SCXW150352841 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335559740&quot;:360,&quot;335572071&quot;:0,&quot;335572072&quot;:0,&quot;335572073&quot;:0,&quot;335572075&quot;:0,&quot;335572076&quot;:0,&quot;335572077&quot;:0,&quot;335572079&quot;:0,&quot;335572080&quot;:0,&quot;335572081&quot;:0,&quot;335572083&quot;:0,&quot;335572084&quot;:0,&quot;335572085&quot;:0,&quot;335572087&quot;:0,&quot;335572088&quot;:0,&quot;335572089&quot;:0,&quot;469789798&quot;:&quot;nil&quot;,&quot;469789802&quot;:&quot;nil&quot;,&quot;469789806&quot;:&quot;nil&quot;,&quot;469789810&quot;:&quot;nil&quot;,&quot;469789814&quot;:&quot;nil&quot;}"> </span></pre>						</div>
				</div>
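<p>For readers who want to see what NMS does conceptually, here is a simplified plain-JavaScript sketch of greedy suppression based on Intersection over Union (IoU). It is an illustration under stated assumptions, not the TensorFlow.js implementation: boxes are assumed to be <code>[y1, x1, y2, x2]</code> corner coordinates, and <code>iou</code> and <code>greedyNms</code> are hypothetical names. The parameter order mirrors the call above (maximum number of boxes, IoU threshold, score threshold).</p>

```javascript
// Intersection over Union of two boxes given as [y1, x1, y2, x2]
// (corner layout is an assumption made for this sketch).
function iou(a, b) {
  const interHeight = Math.max(0, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
  const interWidth = Math.max(0, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
  const intersection = interHeight * interWidth;
  const areaA = (a[2] - a[0]) * (a[3] - a[1]);
  const areaB = (b[2] - b[0]) * (b[3] - b[1]);
  const union = areaA + areaB - intersection;
  return union > 0 ? intersection / union : 0;
}

// Greedy NMS sketch: visit boxes from highest to lowest score and keep a box
// only if it does not overlap an already kept box by more than iouThreshold.
function greedyNms(boxes, scores, maxOutputSize, iouThreshold, scoreThreshold) {
  const order = scores
    .map((score, index) => ({ score, index }))
    .filter((item) => item.score > scoreThreshold)
    .sort((p, q) => q.score - p.score);

  const kept = [];
  for (const { index } of order) {
    if (kept.length >= maxOutputSize) break;
    const overlapsKeptBox = kept.some(
      (keptIndex) => iou(boxes[keptIndex], boxes[index]) > iouThreshold
    );
    if (!overlapsKeptBox) kept.push(index);
  }
  return kept; // indices into the original boxes array
}

// Made-up example: the second box heavily overlaps the first one.
const exampleBoxes = [
  [0, 0, 10, 10],
  [1, 1, 11, 11],
  [20, 20, 30, 30],
];
const keptIndices = greedyNms(exampleBoxes, [0.9, 0.8, 0.7], 40, 0.45, 0.4);
console.log(keptIndices);
```

<p>The returned indices play the same role as <code>nms</code> in the snippet above: they select the surviving boxes, scores, and classes.</p>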
				<div class="elementor-element elementor-element-a61bead elementor-widget elementor-widget-text-editor" data-id="a61bead" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW86839594 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW86839594 BCX0">The last post-processing step maps the box coordinates back to the original image by subtracting the margins and multiplying by the resize factor:</span></span><span class="EOP SCXW86839594 BCX0" data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-84b07f0 elementor-widget elementor-widget-text-editor" data-id="84b07f0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">// Precompute the margins and factors outside the stack</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">const</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0"> yMarginTensor = tf.scalar(yMargin);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">const</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0"> xMarginTensor = tf.scalar(xMargin);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">const</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0"> resizeFactorTensor = tf.scalar(resizeFactor);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span 
class="NormalTextRun SCXW260954935 BCX0">// Slice the boxesData and apply transformations in one step</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">const</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0"> [yCoordinates, xCoordinates, height, width] = </span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">  [</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">'0'</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">, </span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">'1'</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">, </span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">'2'</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span 
class="NormalTextRun SCXW260954935 BCX0">, </span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">'3'</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">].map((index) =&gt; </span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">    boxesData.slice([</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">0</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">, </span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">parseInt</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">(index)], [</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">-1</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">, </span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">1</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 
BCX0">]).sub(index % </span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">2</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0"> === </span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">0</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0"> ? yMarginTensor : xMarginTensor).mul(resizeFactorTensor)</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">// Stack the tensors without converting to arrays (unless needed)</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">const</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 
BCX0"> bbox = tf.stack([yCoordinates, xCoordinates, height, width], </span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">1</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">// Convert to an array only if absolutely necessary</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW260954935 BCX0"><span class="SCXW260954935 BCX0"> </span><br class="SCXW260954935 BCX0" /></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0">const</span></span><span class="TextRun Highlight SCXW260954935 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW260954935 BCX0"> bboxArray = bbox.arraySync();</span></span><span class="EOP SCXW260954935 BCX0" 
data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335559740&quot;:360,&quot;335572071&quot;:0,&quot;335572072&quot;:0,&quot;335572073&quot;:0,&quot;335572075&quot;:0,&quot;335572076&quot;:0,&quot;335572077&quot;:0,&quot;335572079&quot;:0,&quot;335572080&quot;:0,&quot;335572081&quot;:0,&quot;335572083&quot;:0,&quot;335572084&quot;:0,&quot;335572085&quot;:0,&quot;335572087&quot;:0,&quot;335572088&quot;:0,&quot;335572089&quot;:0,&quot;469789798&quot;:&quot;nil&quot;,&quot;469789802&quot;:&quot;nil&quot;,&quot;469789806&quot;:&quot;nil&quot;,&quot;469789810&quot;:&quot;nil&quot;,&quot;469789814&quot;:&quot;nil&quot;}"> </span></pre>						</div>
				</div>
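<p>The arithmetic behind this step can be checked on a single box with plain numbers. The sketch below mirrors the tensor code above under the same convention: values at even positions have the vertical margin subtracted, values at odd positions the horizontal margin, and everything is then multiplied by the resize factor. <code>rescaleBox</code> is a hypothetical helper name and the numbers are made up for illustration.</p>

```javascript
// Hypothetical helper (illustration only): map one box from model-input
// coordinates back to original-image coordinates.
function rescaleBox(box, xMargin, yMargin, resizeFactor) {
  return box.map((value, index) => {
    // Even positions are treated as vertical values, odd ones as horizontal,
    // matching the `index % 2 === 0` check in the tensor version.
    const margin = index % 2 === 0 ? yMargin : xMargin;
    return (value - margin) * resizeFactor;
  });
}

// Made-up example: margins of 100 px (top) and 50 px (left), resize factor 2.
const rescaled = rescaleBox([110, 60, 210, 160], 50, 100, 2);
console.log(rescaled);
```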
				<div class="elementor-element elementor-element-b542632 elementor-widget elementor-widget-text-editor" data-id="b542632" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW184442681 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW184442681 BCX0">To wrap up, we can define the </span></span><span class="TextRun SCXW184442681 BCX0" lang="PL" style="color: #008000;" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW184442681 BCX0">runInference</span></span><span class="TextRun SCXW184442681 BCX0" lang="PL" xml:lang="PL" data-contrast="auto"><span class="NormalTextRun SCXW184442681 BCX0"> function, which ties the whole object detection pipeline together. It preprocesses the image, runs the model inference, and returns the resulting bounding boxes, confidence scores, and class labels. Here’s how it looks:</span></span><span class="EOP SCXW184442681 BCX0" data-ccp-props="{&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559740&quot;:360}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-d6642ff elementor-widget elementor-widget-text-editor" data-id="d6642ff" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">export</span></span> <span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">async</span></span> <span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">function</span></span> <span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">runInference</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">(model, labels, image, confidenceThreshold = </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">0.4</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">) {</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">  </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">try</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0"> {</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun 
Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">    </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">// Preprocess the image</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">    </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">const</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0"> imgSize = model.inputs[</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">0</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">].shape[</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">1</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">];</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">    </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" 
data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">const</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0"> { imgTensor: inputTensor, left: xMargin, top: yMargin, factor: resizeFactor } = preprocessImage(image, imgSize);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">    </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">// Run inference</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">    </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">const</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0"> prediction = </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">await</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0"> model.execute(inputTensor);</span></span><span 
class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">    </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">// Post-process the model output</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">    </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">const</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0"> [boxes, scores, classes] = </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">await</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0"> postprocessInferenceResults(prediction, labels, xMargin, yMargin, resizeFactor, confidenceThreshold);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun 
SCXW217269171 BCX0">    </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">return</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0"> [boxes, scores, classes];</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">  } </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">catch</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0"> (error) {</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">    </span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">throw</span></span> <span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">new</span></span> <span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">Error</span></span><span class="TextRun Highlight 
SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">(</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">`Inference failed: ${error.message}`</span></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">);</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">  }</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW217269171 BCX0"><span class="SCXW217269171 BCX0"> </span><br class="SCXW217269171 BCX0" /></span><span class="TextRun Highlight SCXW217269171 BCX0" lang="PL" xml:lang="PL" data-contrast="none"><span class="NormalTextRun SCXW217269171 BCX0">}</span></span><span class="EOP SCXW217269171 BCX0" data-ccp-props="{&quot;134245417&quot;:false,&quot;201341983&quot;:0,&quot;335559740&quot;:360,&quot;335572071&quot;:0,&quot;335572072&quot;:0,&quot;335572073&quot;:0,&quot;335572075&quot;:0,&quot;335572076&quot;:0,&quot;335572077&quot;:0,&quot;335572079&quot;:0,&quot;335572080&quot;:0,&quot;335572081&quot;:0,&quot;335572083&quot;:0,&quot;335572084&quot;:0,&quot;335572085&quot;:0,&quot;335572087&quot;:0,&quot;335572088&quot;:0,&quot;335572089&quot;:0,&quot;469789798&quot;:&quot;nil&quot;,&quot;469789802&quot;:&quot;nil&quot;,&quot;469789806&quot;:&quot;nil&quot;,&quot;469789810&quot;:&quot;nil&quot;,&quot;469789814&quot;:&quot;nil&quot;}"> </span></pre>						</div>
				</div>
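<p>The inference function hands <code>xMargin</code>, <code>yMargin</code>, and <code>resizeFactor</code> to the post-processing step, presumably so that boxes predicted on the resized, padded input can be mapped back onto the original image. The sketch below shows one way that mapping could work. It assumes the image was first scaled by <code>resizeFactor</code> and then padded by the two margins; the article's <code>preprocessImage</code> may use a different convention. The helper name <code>mapBoxToOriginal</code> is hypothetical and not part of the article's code.</p>

```javascript
// Hypothetical helper: undo "scale, then pad" preprocessing for one box.
// Assumed convention (not confirmed by the article): model coordinates equal
// original coordinates * resizeFactor + margin.
function mapBoxToOriginal(box, xMargin, yMargin, resizeFactor) {
  const [x1, y1, x2, y2] = box;
  return [
    (x1 - xMargin) / resizeFactor,
    (y1 - yMargin) / resizeFactor,
    (x2 - xMargin) / resizeFactor,
    (y2 - yMargin) / resizeFactor,
  ];
}

// Example: a 1280x720 image fitted into a 640x640 model input.
// The scale is 0.5, so the image fills 640x360 and is padded by 140 px
// at the top and bottom, with no horizontal padding.
const mapped = mapBoxToOriginal([100, 240, 300, 440], 0, 140, 0.5);
console.log(mapped); // [200, 200, 600, 600]
```

<p>If the preprocessing instead stores the inverse scale (original size divided by model size), the division becomes a multiplication. It is worth checking this against one known box before relying on it.</p>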
				<div class="elementor-element elementor-element-d087385 elementor-widget elementor-widget-heading" data-id="d087385" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">6. Visualize the Results </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-84689b0 elementor-widget elementor-widget-text-editor" data-id="84689b0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Once we have the processed detections, it’s time to draw them on the canvas:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d769790 elementor-widget elementor-widget-text-editor" data-id="d769790" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>// Draws one rectangle and one text label per detection.
// Note: `labels` is not a parameter; it is read from the enclosing scope.
function drawBoxesOnCanvas(ctx, boxes, classes, scores, colors) {
  boxes.forEach((box, i) =&gt; {
    const [x1, y1, x2, y2] = box; // top-left and bottom-right corners
    ctx.strokeStyle = colors[classes[i]];
    ctx.lineWidth = 2;
    ctx.strokeRect(x1, y1, x2 - x1, y2 - y1); // strokeRect takes x, y, width, height
    ctx.fillStyle = colors[classes[i]];
    ctx.fillText(`${labels[classes[i]]} (${Math.round(scores[i] * 100)}%)`, x1, y1);
  });
}</pre>						</div>
				</div>
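<p>One detail of the drawing step is easy to miss: <code>fillText(text, x, y)</code> treats <code>y</code> as the text baseline, so a label drawn at the top edge of a box sits above it and is clipped when the box touches the top of the canvas. The sketch below shows one possible way to handle this. The helper <code>drawLabel</code>, its 2 px gap, and the default font size are illustrative assumptions, not part of the article's code. A small stand-in object replaces the real canvas context so the calls can be inspected outside the browser.</p>

```javascript
// Hypothetical helper: draw the label above the box when there is room,
// otherwise just inside the box so it is not clipped by the canvas edge.
function drawLabel(ctx, text, x1, y1, fontSize = 10) {
  const baseline = y1 - 2 >= fontSize ? y1 - 2 : y1 + fontSize;
  ctx.font = `${fontSize}px sans-serif`;
  ctx.fillText(text, x1, baseline);
  return baseline;
}

// Minimal stand-in for a CanvasRenderingContext2D that records fillText calls.
const calls = [];
const fakeCtx = {
  font: "",
  fillText: (text, x, y) => calls.push([text, x, y]),
};

drawLabel(fakeCtx, "face (97%)", 40, 120); // enough room: drawn above the box
drawLabel(fakeCtx, "face (88%)", 40, 4); // box at the top edge: drawn inside it
console.log(calls); // [["face (97%)", 40, 118], ["face (88%)", 40, 14]]
```

<p>In the browser, the same helper can be called with the real 2D context in place of <code>fakeCtx</code>.</p>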
				<div class="elementor-element elementor-element-0a7536b elementor-widget elementor-widget-text-editor" data-id="0a7536b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Running a YOLO model for object detection directly in the browser with TensorFlow.js makes real-time detection possible on the client side. This guide covered setting up TensorFlow.js, loading the model, preprocessing images, running inference, and visualizing the results. From here, consider experimenting with different models, optimization techniques, and use cases to see how far machine learning in web applications can go.</p>						</div>
				</div>
		<div class="elementor-element elementor-element-b23a8f8 e-grid e-con-full e-con e-child" data-id="b23a8f8" data-element_type="container">
				<div class="elementor-element elementor-element-2e189c2 elementor-widget elementor-widget-heading" data-id="2e189c2" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Feel free to reach out if you have any questions or want to share your implementations! </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-a749555 elementor-button-success elementor-align-center elementor-widget elementor-widget-button" data-id="a749555" data-element_type="widget" data-widget_type="button.default">
				<div class="elementor-widget-container">
							<div class="elementor-button-wrapper">
					<a class="elementor-button elementor-button-link elementor-size-sm" href="https://inero-software.com/contact-us/">
						<span class="elementor-button-content-wrapper">
						<span class="elementor-button-icon">
				<svg aria-hidden="true" class="e-font-icon-svg e-fas-envelope" viewBox="0 0 512 512" xmlns="http://www.w3.org/2000/svg"><path d="M502.3 190.8c3.9-3.1 9.7-.2 9.7 4.7V400c0 26.5-21.5 48-48 48H48c-26.5 0-48-21.5-48-48V195.6c0-5 5.7-7.8 9.7-4.7 22.4 17.4 52.1 39.5 154.1 113.6 21.1 15.4 56.7 47.8 92.2 47.6 35.7.3 72-32.8 92.3-47.6 102-74.1 131.6-96.3 154-113.7zM256 320c23.2.4 56.6-29.2 73.4-41.4 132.7-96.3 142.8-104.7 173.4-128.7 5.8-4.5 9.2-11.5 9.2-18.9v-19c0-26.5-21.5-48-48-48H48C21.5 64 0 85.5 0 112v19c0 7.4 3.4 14.3 9.2 18.9 30.6 23.9 40.7 32.4 173.4 128.7 16.8 12.2 50.2 41.8 73.4 41.4z"></path></svg>			</span>
									<span class="elementor-button-text">CONTACT</span>
					</span>
					</a>
				</div>
						</div>
				</div>
				</div>
				</div>
		<div class="elementor-element elementor-element-38cdd54 e-con-full e-flex e-con e-child" data-id="38cdd54" data-element_type="container">
				</div>
					</div>
				</div>
				</div>
		<p>The article <a href="https://inero-software.com/running-ai-in-client-side-real-time-face-detection-in-the-browser-using-yolo-and-tensorflow-js-use-case-study/">Running AI in client-side: Real-Time Face Detection in the Browser Using YOLO and TensorFlow.js &#8211; use case study</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">6202</post-id>	</item>
	</channel>
</rss>
