<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>AI development - Inero Software - Software Consulting</title>
	<atom:link href="https://inero-software.com/tag/ai-development/feed/" rel="self" type="application/rss+xml" />
	<link>https://inero-software.com/tag/ai-development/</link>
	<description>We unleash innovations using cutting-edge technologies, modern design and AI</description>
	<lastBuildDate>Fri, 16 May 2025 09:27:59 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.1</generator>

<image>
	<url>https://inero-software.com/wp-content/uploads/2018/11/inero-logo-favicon.png</url>
	<title>AI development - Inero Software - Software Consulting</title>
	<link>https://inero-software.com/tag/ai-development/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">153509928</site>	<item>
		<title>LLM Implementation and Maintenance Costs for Businesses: A Detailed Breakdown</title>
		<link>https://inero-software.com/llm-implementation-and-maintenance-costs-for-businesses-a-detailed-breakdown/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Wed, 14 May 2025 06:44:35 +0000</pubDate>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[BusinessProcessesOptimization]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[cost]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7981</guid>

					<description><![CDATA[<p>In this post we discuss the types of costs associated with using dedicated LLMs and present example calculations for popular models (such as GPT-4, Claude, Mistral, LLaMA, etc.), including business use case scenarios.</p>
<p>Artykuł <a href="https://inero-software.com/llm-implementation-and-maintenance-costs-for-businesses-a-detailed-breakdown/">LLM Implementation and Maintenance Costs for Businesses: A Detailed Breakdown</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7981" class="elementor elementor-7981" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-b624393 e-flex e-con-boxed e-con e-parent" data-id="b624393" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-93f3c2f elementor-widget elementor-widget-html" data-id="93f3c2f" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-3d9c5ec elementor-widget elementor-widget-text-editor" data-id="3d9c5ec" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4>When considering the introduction of artificial intelligence into your company, it’s important to understand the costs involved in implementing and maintaining your own LLM. Expenses go beyond just paying for model usage (e.g., token-based API fees) and include a range of factors — from infrastructure to security. Below, we discuss the types of costs associated with using dedicated LLMs and present example calculations for popular models (such as GPT-4, Claude, Mistral, LLaMA, etc.), including business use case scenarios.</h4>						</div>
				</div>
				<div class="elementor-element elementor-element-085701f elementor-widget elementor-widget-text-editor" data-id="085701f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>More and more companies are considering the use of large language models (LLMs) in their own products and processes. These “dedicated” models can act as intelligent assistants—answering customer questions, analyzing documents, generating reports, and much more. <a href="https://inero-software.com/chatbot-agent-or-ai-assistant-find-out-which-solution-is-best-for-your-business/">You can read more about it here.</a></p><p><span data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-4636eb2 elementor-widget elementor-widget-heading" data-id="4636eb2" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Types of Costs When Using LLMs</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-dc7b85d elementor-widget elementor-widget-text-editor" data-id="dc7b85d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Before starting the implementation, it&#8217;s important to understand all the components that contribute to the total cost of using a dedicated model.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d01d87f elementor-widget elementor-widget-heading" data-id="d01d87f" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Infrastructure:
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-556fadf elementor-widget elementor-widget-text-editor" data-id="556fadf" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>If you&#8217;re using models via a cloud API (OpenAI, Anthropic, Google), </strong>you only pay for the tokens used. The infrastructure cost is &#8220;hidden&#8221; on the provider&#8217;s side.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-fca6d2f elementor-widget elementor-widget-text-editor" data-id="fca6d2f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>If you choose to self-host a model such as Mistral or LLaMA, </strong>you’ll need to maintain a GPU server—either locally or in the cloud. For example, renting an instance with an A100 GPU typically costs $1–2 per hour, which amounts to $750–1,500 per month if the server runs continuously. While such an investment can handle a high volume of queries, it may be underutilized at a smaller scale.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-6ef6f58 elementor-widget elementor-widget-heading" data-id="6ef6f58" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Licensing and Model Fees
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-275e876 elementor-widget elementor-widget-text-editor" data-id="275e876" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Commercial models come with licensing or subscription fees. For example, when using the GPT-4 API from OpenAI or Claude from Anthropic,<strong> you pay per token used</strong> according to the provider&#8217;s pricing (we outline token costs in detail later on). On the other hand, open-source models like LLaMA or Mistral are available for free—<strong>there are no licensing or token fees</strong>. Meta, for instance, released LLaMA 2 under a license that allows businesses to use it freely. However, “free” doesn’t mean zero cost—you’ll still pay for the infrastructure and electricity needed to run the model (as mentioned earlier). It’s also important to check license restrictions: some open models may have specific usage conditions (e.g., restrictions on certain industries).</p>						</div>
				</div>
				<div class="elementor-element elementor-element-aa18bfc elementor-widget elementor-widget-heading" data-id="aa18bfc" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Model Adaptation and Customization
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-96aa203 elementor-widget elementor-widget-text-editor" data-id="96aa203" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>For an LLM to perform well in a specific company setting, it often requires customization—such as additional training (fine-tuning) on company-specific data or at least the preparation of tailored prompts (known as prompt engineering). This adaptation process can generate significant costs:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-8573d17 elementor-widget elementor-widget-text-editor" data-id="8573d17" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>Model Fine-Tuning:</strong> Training a model on your own dataset requires computing power (typically GPUs running for many hours) and expert knowledge. For larger models, this can cost anywhere from several thousand to tens of thousands of dollars—factoring in both infrastructure expenses and specialist time. Even fine-tuning a smaller model (e.g., GPT-3.5) via OpenAI’s API can incur significant costs, as it involves processing hundreds of thousands or even millions of tokens during training—billed according to the provider’s token pricing.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-092f2e3 elementor-widget elementor-widget-text-editor" data-id="092f2e3" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>Prompt Engineering:</strong> As an alternative or complement to training, you can craft tailored prompts and instructions for the model. While writing prompts itself doesn’t require paid resources, iteratively testing and refining multiple versions consumes tokens (which adds cost when using a cloud-based model) and takes up team time. This can be viewed as either an operational cost or a competence-related expense—specialist time is needed to optimize the model’s behavior for your specific use case.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-b4d3407 elementor-widget elementor-widget-heading" data-id="b4d3407" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Operational Costs
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-d96252c elementor-widget elementor-widget-text-editor" data-id="d96252c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>After deploying the model, ongoing operational costs come into play. These include monitoring the model’s performance, maintaining efficiency, logging results, applying updates, and fixing potential issues. If you&#8217;re using an API, the main operational <strong>cost</strong> <strong>will be the monthly bill for consumed tokens,</strong> along with any premium subscription fees (some providers offer subscription plans with usage limits or preferred pricing). If the model is hosted locally, operational costs typically include:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-15a5e0f elementor-widget elementor-widget-text-editor" data-id="15a5e0f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>Electricity consumption</strong> – GPU-based models can consume significant amounts of power, leading to substantial monthly energy costs.</p></li><li><p><strong>System administration</strong> – Time spent by administrators on server maintenance, backups, and updating software components (e.g., AI libraries).</p></li><li><p><strong>Infrastructure scaling</strong> – As demand grows, additional machines or cloud instances may be needed, resulting in further expenses.</p></li><li><p><strong>High availability</strong> – If the LLM assistant needs to operate 24/7 without downtime, you may need to invest in redundant resources (e.g., backup servers) or enter into an SLA agreement with your cloud provider.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-62dc195 elementor-widget elementor-widget-heading" data-id="62dc195" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Team Expertise
</h4>		</div>
				</div>
				<div class="elementor-element elementor-element-3d2c4a9 elementor-widget elementor-widget-text-editor" data-id="3d2c4a9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Implementing an LLM requires the right expertise within the IT/Data team. If your company lacks AI experience, it may be necessary to train existing employees or hire new specialists—such as an ML engineer or MLOps expert—which adds recruitment or training costs. Alternatively, some companies choose to work with external consultants or service providers to deploy the model. This also incurs costs, usually one-time project fees, which can be significant. It&#8217;s also important to account for the time your team spends integrating the model with existing systems (e.g., connecting it to a database or user-facing application). This is a labor cost that’s often overlooked in smaller projects but can have a major impact in practice.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-980dd92 elementor-widget elementor-widget-text-editor" data-id="980dd92" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The categories above show that the total cost of owning a dedicated LLM-based solution goes far beyond just the fee for accessing the model. It&#8217;s important to consider all these factors before making a decision. In the next section, we’ll look at specific numbers: how much a single prompt costs for various popular models, and what it would take to maintain a simple LLM assistant in two example business scenarios.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-aa5ede7 elementor-widget elementor-widget-spacer" data-id="aa5ede7" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-0acc8bb elementor-widget elementor-widget-heading" data-id="0acc8bb" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Cost of a Single Prompt in Popular LLM Models
</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-37ada92 elementor-widget elementor-widget-text-editor" data-id="37ada92" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Language models are typically billed based on the number of tokens processed. A token is a small piece of text—it may represent a single word or part of a word (for example, 1,000 tokens roughly equals 750 words of continuous text). API providers list prices per 1,000 or 1 million tokens.</p><p>Below is a comparison of the approximate cost to process 1,000 tokens using selected popular LLM models:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-94811ff elementor-widget elementor-widget-html" data-id="94811ff" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>LLM Model Comparison</title>
  <link href="https://fonts.googleapis.com/css2?family=Roboto:wght@300&display=swap" rel="stylesheet">
  <style>
    body {
      font-family: 'Roboto', sans-serif;
      font-weight: 300;
      font-size: 14px;
      color: #1C244B;
    }
    table {
      width: 100%;
      border-collapse: collapse;
    }
    th, td {
      border: 1px solid #ccc;
      padding: 8px;
      vertical-align: top;
    }
    th {
      background-color: #f2f2f2;
    }
    td ul {
      margin: 0;
      padding-left: 18px;
    }
  </style>
</head>
<body>

<table>
  <thead>
    <tr>
      <th>LLM Model</th>
      <th>Access / License</th>
      <th>Cost per 1000 tokens</th>
      <th>Notes</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>GPT-3.5 Turbo (OpenAI)</td>
      <td>Cloud API (chat model available, e.g., in ChatGPT)</td>
      <td>$0.0015 (input)<br>$0.0020 (output)</td>
      <td>
        <ul>
          <li>Very low cost – 16k tokens + paid upgrade to 128k</li>
          <li>Good response quality</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>GPT-4 (8k)</td>
      <td>Cloud API (OpenAI)</td>
      <td>$0.08 (input)<br>$0.16 (output)</td>
      <td>High quality; high cost</td>
    </tr>
    <tr>
      <td>GPT-4 Turbo (128k)</td>
      <td>Cloud API (OpenAI)</td>
      <td>$0.01 (input)<br>$0.03 (output)</td>
      <td>
        <ul>
          <li>Reliable large context (up to 128k tokens)</li>
          <li>Cheaper (only slightly more than GPT-3.5)</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Claude Instant v1.2</td>
      <td>Cloud API (Anthropic)</td>
      <td>$0.0008 (input)<br>$0.0024 (output)</td>
      <td>
        <ul>
          <li>Fast, lower-cost Claude model (equivalent to GPT-3.5)</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Claude 2 (100k)</td>
      <td>Cloud API (Anthropic)</td>
      <td>$0.008 (input)<br>$0.024 (output)</td>
      <td>
        <ul>
          <li>High-quality model by Anthropic; context up to 100k tokens</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Mistral 7B</td>
      <td>Open source (free model)</td>
      <td>Token cost: $0</td>
      <td>
        <ul>
          <li>Requires self-hosting</li>
          <li>Alternative to GPT-3.5 – low hardware requirements (can run with <1M tokens)</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>LLaMA 2 13B</td>
      <td>Open source (free model)</td>
      <td>Token cost: $0</td>
      <td>
        <ul>
          <li>Self-hosting required</li>
          <li>Needs stronger hardware (e.g., 2× 24GB GPU) than 7B, but still accessible for many companies</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>LLaMA 2 70B</td>
      <td>Open source (free model)</td>
      <td>Token cost: $0</td>
      <td>
        <ul>
          <li>Requires self-hosting</li>
          <li>Requires expensive infrastructure (e.g., 8× 80GB GPUs)</li>
          <li>At this scale, costs may match or even exceed GPT-4</li>
        </ul>
      </td>
    </tr>
  </tbody>
</table>

</body>
</html>
		</div>
				</div>
				<div class="elementor-element elementor-element-6267324 elementor-widget elementor-widget-text-editor" data-id="6267324" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p class="" data-start="67" data-end="109"><strong data-start="67" data-end="109">Legend: How Token Costs Are Calculated</strong></p><ul><li style="list-style-type: none;"><ul data-start="111" data-end="248"><li class="" data-start="111" data-end="171"><p class="" data-start="113" data-end="171"><strong data-start="113" data-end="129">Input tokens</strong> – words contained in the user&#8217;s prompt.</p></li><li class="" data-start="172" data-end="248"><p class="" data-start="174" data-end="248"><strong data-start="174" data-end="191">Output tokens</strong> – words generated by the model in response (completion).</p></li></ul></li></ul><p class="" data-start="250" data-end="353">For most commercial providers, the cost is charged separately for input and output tokens. For example:</p><p class="" data-start="355" data-end="371"><strong data-start="355" data-end="371">GPT-4 Turbo:</strong></p><ul><li style="list-style-type: none;"><ul data-start="373" data-end="439"><li class="" data-start="373" data-end="406"><p class="" data-start="375" data-end="406">1,000 input tokens: <strong data-start="395" data-end="404">$0.03</strong></p></li><li class="" data-start="407" data-end="439"><p class="" data-start="409" data-end="439">1,000 output tokens: <strong data-start="430" data-end="439">$0.06</strong></p></li></ul></li></ul><p class="" data-start="441" data-end="557">If a dialogue contains a total of 1,000 tokens (e.g., 500 input + 500 output), the cost is approximately <strong data-start="546" data-end="556">$0.045</strong>.</p><p class="" data-start="559" data-end="652">For simplicity, you can assume that a full interaction of 1,000 tokens costs about <strong data-start="642" data-end="651">$0.09</strong>.</p><p class="" data-start="654" data-end="672"><strong data-start="654" data-end="672">By comparison:</strong></p><ul><li style="list-style-type: none;"><ul data-start="674" data-end="969" data-is-last-node="" data-is-only-node=""><li class="" data-start="674" data-end="777"><p class="" data-start="676" data-end="777"><strong data-start="676" data-end="693">GPT-3.5 Turbo</strong> – a similar 1,000-token dialogue costs only about <strong data-start="744" data-end="755">$0.0035</strong> (i.e., 0.35 cents).</p></li><li class="" data-start="778" data-end="969"><p class="" data-start="780" data-end="969"><strong data-start="780" data-end="802">Open-source models</strong> (e.g., Mistral, LLaMA) – token costs are <strong data-start="844" data-end="850">$0</strong>, since the models run locally. You only pay for infrastructure-related costs (power consumption, server uptime, etc.).</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-2c3b4b9 elementor-widget elementor-widget-text-editor" data-id="2c3b4b9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Open-source models (such as Mistral, LLaMA, etc.) are attractive because they come with no fees for the model itself—you can generate any number of tokens without paying the model provider a cent. However, to run these models, you need to maintain your own infrastructure. At a small scale, the cost of renting a machine for a single query may actually exceed the cost of an individual API call to a model like GPT. On the other hand, at a large scale—with many queries per day—open-source solutions can become significantly more cost-effective. In summary, cost-effectiveness depends on the use case, which we’ll explore in the next section.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-68c5cf5 elementor-widget elementor-widget-spacer" data-id="68c5cf5" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-eb32f74 elementor-widget elementor-widget-heading" data-id="eb32f74" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Example Costs of Implementing an LLM Assistant (100 Queries per Day)
</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-d65244a elementor-widget elementor-widget-text-editor" data-id="d65244a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Let’s now consider a practical scenario: your company wants to implement a simple LLM-based virtual assistant that performs one of the following tasks:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-54a353d elementor-widget elementor-widget-text-editor" data-id="54a353d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>Document analysis</strong> – e.g., the assistant reads offers or contracts and extracts key information such as clauses, deadlines, and amounts.</p></li><li><p><strong>Customer inquiry handling</strong> – e.g., the assistant replies to customer emails with questions about pricing, product availability, technical support, etc.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-e25102c elementor-widget elementor-widget-text-editor" data-id="e25102c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Let’s assume that:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-e1312ca elementor-widget elementor-widget-text-editor" data-id="e1312ca" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p>The assistant will handle approximately <strong>100 interactions per day</strong>.</p></li><li><p>Each interaction consists of a <strong>prompt and a response</strong>, totaling around <strong>2,000 tokens</strong> (e.g., 1,000 tokens in the prompt—roughly 750 words or several paragraphs—and 1,000 tokens in the response, or about 750 generated words). This token size covers fairly complex queries and detailed replies.</p></li><li><p>On a monthly basis, the assistant will process around <strong>6 million tokens</strong> (3,000 interactions × 2,000 tokens = 6,000,000 tokens).</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-fd1201f elementor-widget elementor-widget-text-editor" data-id="fd1201f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>We want to compare the <strong>monthly operating costs</strong> of such an assistant depending on the choice of model and deployment approach. We&#8217;ll present two variants:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-405f91b elementor-widget elementor-widget-text-editor" data-id="405f91b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p><strong>API Variant (Closed Model):</strong> We use a commercial model via an API (e.g., OpenAI GPT or Anthropic Claude). We don’t maintain our own servers—costs are limited to token usage, billed according to the provider’s pricing.</p></li><li><p><strong>Self-Hosted Variant (Open-Source Model):</strong> We use an open-source model (e.g., Mistral or LLaMA) deployed on our own servers. Costs include infrastructure needed to support approximately 100 queries per day—such as cloud GPU instance rental or hardware amortization, plus electricity.</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-0c96b1a elementor-widget elementor-widget-text-editor" data-id="0c96b1a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Below is a table comparing <strong>estimated monthly costs</strong> for several example models under both deployment variants, assuming <strong>6 million tokens per month</strong>:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-7d37b9a elementor-widget elementor-widget-html" data-id="7d37b9a" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Monthly LLM Cost Comparison</title>
  <link href="https://fonts.googleapis.com/css2?family=Roboto:wght@300&display=swap" rel="stylesheet">
  <style>
    body {
      font-family: 'Roboto', sans-serif;
      font-weight: 300;
      font-size: 14px;
      color: #1C244B;
    }
    table {
      width: 100%;
      border-collapse: collapse;
      margin-top: 20px;
    }
    th, td {
      border: 1px solid #ccc;
      padding: 8px;
      vertical-align: top;
    }
    th {
      background-color: #f2f2f2;
    }
    td ul {
      margin: 0;
      padding-left: 18px;
    }
  </style>
</head>
<body>

<table>
  <thead>
    <tr>
      <th>Model (variant)</th>
      <th>Estimated Monthly Cost</th>
      <th>Comment</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>GPT-3.5 Turbo (API)</td>
      <td>approx. $18 (USD)</td>
      <td>
        <ul>
          <li>Very low cost for this quality level.</li>
          <li>Estimate: approx. $0.0027/1k tokens → $12 for generating 4M tokens + $6 for prompts → ~$18/month total.</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>GPT-4 (8k) (API)</td>
      <td>approx. $270</td>
      <td>
        <ul>
          <li>Much higher cost for better quality.</li>
          <li>Example: 8M tokens → cost: 8M × $0.08/1k (input) + $0.16/1k (output) → $270–$540 monthly.</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>GPT-4 Turbo (128k) (API)</td>
      <td>approx. $18</td>
      <td>
        <ul>
          <li>Slightly more expensive than GPT-3.5 due to cheaper input/output token pricing.</li>
          <li>May even deliver better quality than GPT-4 (8k).</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Claude Instant (API)</td>
      <td>approx. $20–25</td>
      <td>
        <ul>
          <li>Comparable to GPT-3.5 in cost.</li>
          <li>Estimate: approx. $0.0021/1k tokens (input+output) → ~$18–25 for 8M tokens (plus potential flat fees).</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Claude 2 (API)</td>
      <td>approx. $150–200</td>
      <td>
        <ul>
          <li>Cheaper than GPT-4, but still several times more expensive than GPT-3.5.</li>
          <li>Estimate: $0.032/1k tokens → ~$192 for 8M tokens.</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Mistral 7B (open source, self-hosted, 1x GPU)</td>
      <td>approx. $300</td>
      <td>
        <ul>
          <li>Cost mainly for maintaining server/GPU.</li>
          <li>Assumption: 1x 24GB GPU instance – model generates ~30–60 tokens/sec, power usage 100–150W.</li>
          <li>Actual cost depends on location and usage (electricity + server = ~$300–400/month).</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>LLaMA 2 70B (open source, self-hosted, multi-GPU)</td>
      <td>approx. $1,000+</td>
      <td>
        <ul>
          <li>High cost due to powerful GPU requirements.</li>
          <li>Typically requires at least 8×80GB GPUs (~$10k–12k hardware + high power consumption).</li>
          <li>Costs vary based on setup model (on-prem / cloud / GPU provider).</li>
        </ul>
      </td>
    </tr>
    <tr>
      <td>Local model (e.g., LLaMA 13B, GPTQ, Mistral 7B – CPU)</td>
      <td>approx. $300–500</td>
      <td>
        <ul>
          <li>Cost includes operation of local server.</li>
          <li>May be slower than GPT-3.5, but offers more privacy and control.</li>
          <li>For CPU instance (e.g., 12 cores, 64 GB RAM), monthly cost is mainly for electricity and maintenance.</li>
        </ul>
      </td>
    </tr>
  </tbody>
</table>

</body>
</html>
		</div>
				</div>
				<div class="elementor-element elementor-element-c433e92 elementor-widget elementor-widget-text-editor" data-id="c433e92" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>From the above comparison, several key takeaways can be drawn:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-cdd2a41 elementor-widget elementor-widget-text-editor" data-id="cdd2a41" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Small-scale usage (100 queries/day) favors API solutions</strong></p><p>With relatively low query volume, using a commercial API (OpenAI, Anthropic) is highly cost-effective—especially with lower-priced models like GPT-3.5 or Claude Instant, where monthly costs can be as low as a few dozen dollars. For higher-end models, monthly costs may rise to several hundred dollars. Still, at this scale, running your own GPU server at $300+ per month would be less economical than relying on cloud-based APIs.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-e8cf4e9 elementor-widget elementor-widget-text-editor" data-id="e8cf4e9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Large-scale usage (thousands of queries) changes the equation</strong></p><p>If your assistant becomes successful and the number of queries increases by 10x or even 100x, the monthly API bill could grow to thousands or even tens of thousands of dollars. In such cases, investing in an open-source, self-hosted model starts to make financial sense.  With a high enough query volume, the <strong>per-request cost</strong> of running the model locally becomes lower than the API cost—since the purchased or rented hardware is being used more efficiently. In extreme cases of massive scale, some organizations may even consider training their own model from scratch—but this is typically reserved for the largest players with very substantial budgets.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-8d36cb0 elementor-widget elementor-widget-text-editor" data-id="8d36cb0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Use Case Matters (Quality vs. Cost Efficiency)</strong></p><p>Choosing the right model shouldn&#8217;t be based solely on cost—it also depends on the quality of output required for your use case. In a <strong>document analysis</strong> scenario, precision in extracting information is the top priority. A lower-cost or open-source model may be sufficient here, especially if fine-tuned to the task. A model with 7B–13B parameters can offer adequate performance at a much lower cost. Moreover, when processing <strong>sensitive documents</strong> (e.g., contracts), running the model locally ensures that the content never leaves your organization—an invaluable benefit from a legal and data privacy standpoint. On the other hand, in <strong>customer inquiry handling</strong>, where natural language quality, politeness, and contextual understanding are critical, <strong>GPT-4</strong> can significantly outperform smaller models. In this case, a company may find it worthwhile to pay more for superior customer experience.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-e71a8c1 elementor-widget elementor-widget-text-editor" data-id="e71a8c1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Hidden Costs Around the Project</strong></p><p>It&#8217;s important to note that the above calculations cover only the <strong>technical costs</strong>—such as token usage or infrastructure. In practice, there are also <strong>&#8220;soft&#8221; costs</strong> to consider, including staff time for preparing the implementation, integrating the model with systems like a CRM or knowledge base, testing, and ongoing iterations and improvements. For example, if the assistant needs to retrieve data from a company&#8217;s internal document repository, those documents often need to be <strong>organized or cleaned</strong> before they can be effectively used by the model.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-a572344 elementor-widget elementor-widget-spacer" data-id="a572344" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-2a1f46d elementor-widget elementor-widget-heading" data-id="2a1f46d" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Cost Example: AI Assistant for Analyzing Emails and PDF Documents
</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-f3e96de elementor-widget elementor-widget-text-editor" data-id="f3e96de" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Here we also present the cost breakdown of our assistant based on Google&#8217;s Gemini model, which we described [<a href="https://inero-software.com/meet-your-personal-ai-agent-a-case-study-for-a-freight-forwarding-company/">here</a>]. Its task is to automatically analyze incoming emails to identify insurance policies and extract key data from attached PDF documents—such as policy number, insured party address, or payment confirmation.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-149557e elementor-widget elementor-widget-text-editor" data-id="149557e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Average Token Count per Email:</strong></p><ul><li style="list-style-type: none;"><ul><li><p><strong>Input:</strong> 3,500 tokens</p></li><li><p><strong>Output:</strong> 220 tokens</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-6ac8e71 elementor-widget elementor-widget-text-editor" data-id="6ac8e71" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Analyzing 100 emails with attachments using the <strong>Gemini 2.0 Flash</strong> model costs approximately <strong>$1.50</strong>.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-6721885 elementor-widget elementor-widget-heading" data-id="6721885" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Summary</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-2655d3c elementor-widget elementor-widget-text-editor" data-id="2655d3c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>Can We Afford Our Own “ChatGPT” in the Company? </strong>As we&#8217;ve seen, the answer is: <strong>it depends</strong>—primarily on the scale of usage and quality requirements. The key lies in selecting a model and deployment method that aligns with your specific needs. An <strong>iterative approach</strong> is often the most practical: start with a lower-cost model or API, evaluate the results, and scale up to a more powerful model or self-hosted solution as the project matures. Regardless of the path you choose, <strong>careful planning and cost monitoring</strong> across all categories is essential. We hope this comparison helps you make informed decisions and prepare a realistic budget for implementing a dedicated LLM in your organization.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-ec198b5 elementor-widget elementor-widget-text-editor" data-id="ec198b5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>If you&#8217;re considering implementing an assistant in your company, it&#8217;s worth finding answers to the following questions:</strong></p>						</div>
				</div>
				<div class="elementor-element elementor-element-22bdc83 elementor-widget elementor-widget-text-editor" data-id="22bdc83" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><p>Do I need high-quality responses (e.g., GPT-4), or is an approximate answer sufficient (e.g., Claude Haiku, Gemini Flash)?</p></li><li><p>Am I processing sensitive data (e.g., customer documents)?</p></li><li><p>Do I have an IT team capable of hosting a model in-house?</p></li><li><p>What is the expected number of queries per day/month?</p></li><li><p>Is it more cost-effective to maintain my own infrastructure, or should I pay for API access?</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-f145f07 elementor-widget elementor-widget-text-editor" data-id="f145f07" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>For small to medium-scale applications, the cost of using a dedicated LLM can be quite reasonable. Thanks to cloud-based services, it’s possible to get started for just a few dozen dollars per month with models like GPT-3.5 or Claude Instant—an excellent option for experimentation and early prototypes. If you need top-tier performance, such as what GPT-4 offers, you&#8217;ll need to account for higher costs. However, even a few hundred dollars per month can be justified if the business value is significant—for example, by automating tasks that would otherwise require many hours of manual work.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-b80a60d elementor-widget elementor-widget-text-editor" data-id="b80a60d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>On the other hand, for large companies planning intensive AI use, costs can grow exponentially—making it worth considering open-source options and greater investment in in-house infrastructure. Open models like LLaMA or Mistral offer freedom from per-token fees, but shift the cost burden to hardware and staffing. They become cost-effective when operating at scale or when <strong>full control over data</strong> is a top priority.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-65aa533 elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="65aa533" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<a class="elementor-cta" href="https://inero-software.com/contact-us/">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/02/cta-AI2-1030x579.png);" role="img" aria-label="cta AI2"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Looking to Bring AI Tools into Your Company?					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						We offer comprehensive technology support in the field of artificial intelligence and AI agents.
Tell us about your idea!
					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<span class="elementor-cta__button elementor-button elementor-size-">
						Contact Us					</span>
					</div>
							</div>
						</a>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>Artykuł <a href="https://inero-software.com/llm-implementation-and-maintenance-costs-for-businesses-a-detailed-breakdown/">LLM Implementation and Maintenance Costs for Businesses: A Detailed Breakdown</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7981</post-id>	</item>
		<item>
		<title>Chatbot, Agent or AI Assistant? Find Out Which Solution Is Best for Your Business</title>
		<link>https://inero-software.com/chatbot-agent-or-ai-assistant-find-out-which-solution-is-best-for-your-business/</link>
		
		<dc:creator><![CDATA[Marta Kuprasz]]></dc:creator>
		<pubDate>Thu, 08 May 2025 08:57:21 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[BusinessProcessesOptimization]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[Gemini]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7947</guid>

					<description><![CDATA[<p>Artificial intelligence and Large Language Models are buzzwords heard in nearly every industry. Many companies are wondering how to use them safely and which solution will be the most effective. There are plenty of options—and they’re often hard to tell apart. In this article, we break them down in a clear and easy-to-understand way.</p>
<p>Artykuł <a href="https://inero-software.com/chatbot-agent-or-ai-assistant-find-out-which-solution-is-best-for-your-business/">Chatbot, Agent or AI Assistant? Find Out Which Solution Is Best for Your Business</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7947" class="elementor elementor-7947" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-c1eecc3 e-flex e-con-boxed e-con e-parent" data-id="c1eecc3" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-a23440b elementor-widget elementor-widget-html" data-id="a23440b" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-5054636 elementor-widget elementor-widget-text-editor" data-id="5054636" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4>Artificial intelligence and Large Language Models are buzzwords heard in nearly every industry. Many companies are wondering how to use them safely and which solution will be the most effective. There are plenty of options—and they’re often hard to tell apart. In this article, we break them down in a clear and easy-to-understand way.</h4>						</div>
				</div>
				<div class="elementor-element elementor-element-ef953eb elementor-widget elementor-widget-text-editor" data-id="ef953eb" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>AI can take on many roles in a company—as a chatbot, assistant, agent, data analysis tool, content generator, or knowledge search engine. So how can you choose the solution that best fits your employees’ needs? It helps to understand what each option has to offer.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-3d8a982 elementor-widget elementor-widget-heading" data-id="3d8a982" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Chatbot – answers questions, provides explanations, and handles requests
</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-9aafe69 elementor-widget elementor-widget-text-editor" data-id="9aafe69" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>This is the most common use of AI in areas such as customer service and sales. An AI chatbot based on a large language model, such as ChatGPT, can hold natural conversations, understand the context of inquiries, and deliver accurate answers—24/7, in multiple languages, and without human involvement.</p><p> </p><p>These solutions are typically implemented on websites, in messaging platforms (like Messenger or WhatsApp), or within helpdesk systems, where they assist with answering questions, tracking orders, or providing product information. As a result, they significantly automate customer service, reduce operational costs, and improve customer satisfaction ratings.</p><p> </p><p>For the purposes of this article, we define a chatbot as an AI interface primarily intended for external users—in other words, it operates “outside the company.” This definition distinguishes it from AI agents, which perform more complex tasks within internal processes by integrating with systems, databases, or APIs.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-67e3688 elementor-widget elementor-widget-image" data-id="67e3688" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img fetchpriority="high" decoding="async" width="1030" height="408" src="https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226-1030x408.png" class="attachment-large size-large wp-image-7936" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226-1030x408.png 1030w, https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226-300x119.png 300w, https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226-768x304.png 768w, https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226-1536x609.png 1536w, https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226-757x300.png 757w, https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226.png 1832w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7936" data-permalink="https://inero-software.com/pl/chatbot-agent-czy-asystent-ai-sprawdz-ktore-rozwiazanie-najlepiej-sprawdzi-sie-w-twoim-biznesie/zrzut-ekranu-2025-05-06-122226/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226.png" data-orig-size="1832,726" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Zrzut ekranu 2025-05-06 122226" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226-300x119.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/05/Zrzut-ekranu-2025-05-06-122226-1030x408.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-fb8d9b1 elementor-widget elementor-widget-text-editor" data-id="fb8d9b1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><a href="https://www.incone60.eu/seastat">https://www.incone60.eu/seastat</a></p><p> </p>						</div>
				</div>
				<div class="elementor-element elementor-element-5fb00e5 elementor-widget elementor-widget-spacer" data-id="5fb00e5" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-70e37da elementor-widget elementor-widget-heading" data-id="70e37da" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">AI Agent – a tool designed to carry out specific tasks</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-ae7ea10 elementor-widget elementor-widget-text-editor" data-id="ae7ea10" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p class="" data-start="0" data-end="327">Unlike a chatbot, which interacts with external users, an AI agent operates within the organization and supports employees by automating specific business processes. It’s not a one-size-fits-all tool—it’s built with a clearly defined purpose in mind, such as document processing, data analysis, or integration with ERP systems.</p><p data-start="0" data-end="327"> </p><p class="" data-start="329" data-end="590">Thanks to large language models like Gemini or Claude, an AI agent can understand context, make decisions, and trigger specific actions—without human input. It can run in the background, process data from multiple sources, manage files, or handle email inboxes. Each AI agent is tailored to the company’s individual needs and specific tasks. Only then can it offer real value instead of becoming just another generic tool.</p><p class="" data-start="754" data-end="930">Want to see how this works in practice?</p><p class="" data-start="754" data-end="930"><br data-start="793" data-end="796" />Check out our case study:<a href="https://inero-software.com/meet-your-personal-ai-agent-a-case-study-for-a-freight-forwarding-company/"> Meet your personal AI agent-a case study for a freight forwarding company</a> – where we describe how we built an agent integrated with an email inbox.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-272f47a elementor-widget__width-initial elementor-widget elementor-widget-video" data-id="272f47a" data-element_type="widget" data-settings="{&quot;youtube_url&quot;:&quot;https:\/\/youtu.be\/B4VxxjWYzDM&quot;,&quot;video_type&quot;:&quot;youtube&quot;,&quot;controls&quot;:&quot;yes&quot;}" data-widget_type="video.default">
				<div class="elementor-widget-container">
					<div class="elementor-wrapper elementor-open-inline">
			<div class="elementor-video"></div>		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-a535da1 elementor-widget elementor-widget-spacer" data-id="a535da1" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-a575c44 elementor-widget elementor-widget-heading" data-id="a575c44" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">AI Assistant – supports users in daily work by operating contextually and “in the background”</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-ddbd088 elementor-widget elementor-widget-text-editor" data-id="ddbd088" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Unlike a chatbot that answers questions or an agent that automates a specific process, an AI assistant is a tool that works alongside employees in real time—it understands context, suggests next steps, and makes tasks easier within familiar applications.</p><p> </p><p>It’s typically integrated into a specific work environment, such as a word processor, spreadsheet, CRM, or project management tool. The assistant doesn’t replace the user—it actively supports them in making decisions, writing, analyzing data, or planning.</p><p> </p><p>AI assistants like GitHub Copilot, Notion AI, or Google’s Workspace assistant show how this technology can genuinely boost team productivity and reduce time spent on routine tasks. From a business perspective, a well-designed assistant can improve work quality, reduce errors, and make onboarding new employees easier.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-ebf3d14 elementor-widget elementor-widget-spacer" data-id="ebf3d14" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-823b953 elementor-widget elementor-widget-heading" data-id="823b953" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Other Business Applications of Large Language Models</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-767d863 elementor-widget elementor-widget-text-editor" data-id="767d863" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The possibilities go far beyond chatbots, assistants, or agents. These models can take on specialized roles, supporting tasks such as document processing, data analysis, or content creation. They’re increasingly used to automatically summarize reports, extract information from unstructured sources (like emails, PDFs, or scanned forms), or answer natural-language questions based on internal documentation.</p><p> </p><p>LLMs can also assist marketing teams by generating suggestions for ad copy, product descriptions, or sales messages tailored to the company’s style. In analytics departments, they provide faster access to data—generating database queries, interpreting results, and presenting insights in a way that’s easy for non-technical users to understand. These applications often don’t require building a new tool from scratch, but rather integrating the AI model into existing company systems. This way, the technology supports specific tasks—right where it’s needed.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-0a9debf elementor-widget elementor-widget-spacer" data-id="0a9debf" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-3481b69 elementor-widget elementor-widget-heading" data-id="3481b69" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">AI Models and Data Security
</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-5be148c elementor-widget elementor-widget-text-editor" data-id="5be148c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Business owners and managers still approach AI tools with caution, mainly because they’re unsure how to ensure the security and confidentiality of processed data. We’ve explored these topics in previous publications that are worth reviewing.</p><p> </p><p>In the article <em>“</em><a href="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/" rel="bookmark"><strong>AI User Privacy: An Analysis of Platform Policies</strong></a><em>”</em>, we outlined the data privacy and model training policies followed by major AI providers such as OpenAI, Google Gemini, Microsoft’s Azure OpenAI, and Anthropic’s Claude.</p><p> </p><p>For those considering an on-premise solution, we recommend the blog post <em>“</em><strong><a href="https://inero-software.com/top-lightweight-llms-for-local-deployment/" rel="bookmark">Top Lightweight LLMs for Local Deployment</a></strong><em>”</em> There, we reviewed several top open-source lightweight LLMs and explained how to run them on a local Windows machine—even with limited GPU resources.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-b40d87c elementor-widget elementor-widget-text-editor" data-id="b40d87c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Choosing the right AI tool for your company depends primarily on the goal it’s meant to achieve. A chatbot works best where quick and accessible customer service is key. An AI agent can automate repetitive internal processes and improve information flow between systems. An AI assistant provides day-to-day support for employees—offering suggestions, summaries, or preparing data for further use.</p><p> </p><p>Large language models also allow integration with existing processes—without the need to build a dedicated tool from scratch. However, implementing AI-based technology requires a well-thought-out decision, taking into account both efficiency and data security. If you&#8217;re looking to adopt AI in your company and need an experienced partner to guide you through the process, get in touch with us.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d041a5c elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="d041a5c" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<a class="elementor-cta" href="https://inero-software.com/contact-us/">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/03/cta-1903-1030x579.png);" role="img" aria-label="cta 1903"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Bring AI into Your Business					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						We provide professional consulting and end-to-end implementation of tools based on large language models.
					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<span class="elementor-cta__button elementor-button elementor-size-">
						Contact Us					</span>
					</div>
							</div>
						</a>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>Artykuł <a href="https://inero-software.com/chatbot-agent-or-ai-assistant-find-out-which-solution-is-best-for-your-business/">Chatbot, Agent or AI Assistant? Find Out Which Solution Is Best for Your Business</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7947</post-id>	</item>
		<item>
		<title>AI User Privacy: An Analysis of Platform Policies</title>
		<link>https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Wed, 30 Apr 2025 08:35:35 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[Gemini]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Privacy Policies]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7890</guid>

					<description><![CDATA[<p>In this article, we’ll break down the data privacy policies of top AI platforms. You will also learn what to do to ensure your data is not used for training Large Language Models (LLM).</p>
<p>Artykuł <a href="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/">AI User Privacy: An Analysis of Platform Policies</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7890" class="elementor elementor-7890" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-bc35505 e-flex e-con-boxed e-con e-parent" data-id="bc35505" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-44ba7b6 elementor-widget elementor-widget-html" data-id="44ba7b6" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-23d3d65 elementor-widget elementor-widget-text-editor" data-id="23d3d65" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4>Ever wondered where your data goes when you interact with AI cloud platforms? Or is it used to train future models? In this article, we’ll break down the data privacy policies of top AI platforms. You will also learn what to do to ensure your data is not used for training Large Language Models (LLM).</h4>						</div>
				</div>
				<div class="elementor-element elementor-element-18af7ef elementor-widget elementor-widget-text-editor" data-id="18af7ef" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Major AI cloud providers have become increasingly transparent about their data usage policies &#8211; especially when it comes to training models. While most platforms, particularly those offering enterprise-level services, do not use your inputs and outputs for training by default, the fine print matters. Understanding how these services handle your data &#8211; and how you can maintain control &#8211; is essential.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-e8dd97a elementor-widget elementor-widget-text-editor" data-id="e8dd97a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>In this article, we’ll break down the data privacy and model training policies of top AI platforms, including OpenAI, Google Gemini, Microsoft’s Azure OpenAI and Anthropic’s Claude. You’ll learn:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-33a2f78 elementor-widget elementor-widget-text-editor" data-id="33a2f78" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li>How AI platforms use your data and whether your data is used to train models by default</li><li>How to prevent AI from using your data opt, if needed</li><li>Where your data is stored (data residency), and</li><li>What compliance measures (like GDPR) apply</li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-291cb3e elementor-widget elementor-widget-text-editor" data-id="291cb3e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Adopting AI isn’t just about prompt engineering or model performance. It’s also about knowing where your data goes—and how to ensure it stays under your control.</p><p><strong>Here’s what you need to know:</strong></p>						</div>
				</div>
				<div class="elementor-element elementor-element-ecb1c2a elementor-widget elementor-widget-spacer" data-id="ecb1c2a" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-2add4a6 elementor-widget elementor-widget-heading" data-id="2add4a6" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">OpenAI – Data Usage and Privacy</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-cb6fb79 elementor-widget elementor-widget-text-editor" data-id="cb6fb79" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>OpenAI treats your data differently based on how you interact with its services:</p><p><strong>ChatGPT App (Web/Mobile)</strong></p><p>When you chat with ChatGPT, your conversations may be used to train AI models &#8211; unless you manually opt out. To prevent your data from being used:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-01b1f1c elementor-widget elementor-widget-text-editor" data-id="01b1f1c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li>Go to Settings → Data Controls → Improve the model for everyone and toggle it off.</li><li>Even with the opt-out, OpenAI stores chats for 30 days for abuse monitoring before deletion.</li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-1fb6e1a elementor-widget elementor-widget-image" data-id="1fb6e1a" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img decoding="async" data-attachment-id="7897" data-permalink="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/image-2-2/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png" data-orig-size="602,407" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image (2)" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/image-2-1-300x203.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png" tabindex="0" role="button" width="602" height="407" src="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png" class="attachment-large size-large wp-image-7897" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png 602w, https://inero-software.com/wp-content/uploads/2025/04/image-2-1-300x203.png 300w, https://inero-software.com/wp-content/uploads/2025/04/image-2-1-444x300.png 444w" sizes="(max-width: 602px) 100vw, 602px" data-attachment-id="7897" data-permalink="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/image-2-2/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png" data-orig-size="602,407" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image (2)" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/image-2-1-300x203.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/image-2-1.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-422566b elementor-widget elementor-widget-heading" data-id="422566b" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">OpenAI API and ChatGPT Enterprise</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-0be90ca elementor-widget elementor-widget-text-editor" data-id="0be90ca" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>If you&#8217;re a developer or a business using <strong>OpenAI&#8217;s API</strong> or <strong>ChatGPT Enterprise</strong>, there’s no need to opt out. By default, <strong>OpenAI does not use API or Enterprise data to train its models</strong>, and <strong>your data stays private</strong>. You don’t need to do anything to opt out &#8211; it’s already protected. You can choose to share data to help improve the model, but only if you want to.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d497044 elementor-widget elementor-widget-heading" data-id="d497044" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">Data Residency  </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-bd96e19 elementor-widget elementor-widget-text-editor" data-id="bd96e19" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">OpenAI’s servers are mostly based in the </span><strong>United States</strong><span data-contrast="auto"><strong>,</strong> and currently, if you&#8217;re using the API directly, </span><strong>you can’t choose where your data is stored</strong><span data-contrast="auto"><strong>.</strong> That means your data is processed within OpenAI’s own infrastructure &#8211; protected by strong security, but </span><strong>not necessarily hosted in your country. </strong></p><p><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">However, there’s some progress for enterprise users. OpenAI recently introduced an option for </span><strong>eligible enterprise API</strong><b><span data-contrast="auto"> customers</span></b><span data-contrast="auto"> that allows data to be stored in </span><strong>Europe</strong><span data-contrast="auto">, provided there’s a specific agreement in place.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><strong>If regional data residency</strong><span data-contrast="auto"> is important for your business &#8211; say, for GDPR or internal compliance &#8211; you might want to consider using </span><strong>Azure OpenAI</strong><span data-contrast="auto">, which hosts OpenAI’s models on Microsoft’s cloud. With Azure, you can choose a region like </span><strong>Western Europe or Asia</strong><span data-contrast="auto"><strong>,</strong> and all data processing and storage will stay within that geography.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">We’ll dive into Azure more in the next section &#8211; but in short: </span><strong>OpenAI handles your data securely</strong><span data-contrast="auto">, but for strict control over </span><i><span data-contrast="auto">where</span></i><span data-contrast="auto"> it lives, </span><b><span data-contrast="auto">a</span></b><strong> partner cloud service like Azure may be a better fit. </strong></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5c3e1ec elementor-widget elementor-widget-spacer" data-id="5c3e1ec" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-0a6231f elementor-widget elementor-widget-heading" data-id="0a6231f" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Google (Gemini) – Google’s Approach to Your Data </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-c7c8d68 elementor-widget elementor-widget-text-editor" data-id="c7c8d68" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Google’s foray into generative AI includes </span><strong>Gemini</strong><span data-contrast="auto">, a next-generation model that powers products like Google Gemini (the chatbot) and various enterprise AI offerings on Google Cloud. Here&#8217;s how they handle your data:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p><h5><b><span data-contrast="auto">Gemini App</span></b><span data-ccp-props="{}"> </span></h5><div><span data-ccp-props="{}"> </span></div><p><strong>By default, Google does save your Gemini chat history to your account (much like search history) and may use it to improve their service. However, Google provides a “Gemini Activity” setting to control this</strong><span data-contrast="auto"><strong>.</strong> </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">To manage this:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-83c9f69 elementor-widget elementor-widget-text-editor" data-id="83c9f69" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="5" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="auto">Visit </span><strong>Gemini Activity</strong><span data-contrast="auto"> settings.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="5" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="auto">Pause Gemini Activity to stop saving chats and prevent them from being used in </span><strong>AI model training data sources. </strong></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="5" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="auto">You can also delete existing conversation history.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-13a61aa elementor-widget elementor-widget-text-editor" data-id="13a61aa" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><a href="https://support.google.com/gemini/answer/13594961#your_data"><span class="TextRun Underlined SCXW259565000 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW259565000 BCX0" data-ccp-charstyle="Hyperlink">T</span></span><span class="TextRun Underlined SCXW259565000 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW259565000 BCX0" data-ccp-charstyle="Hyperlink">urning off </span><span class="NormalTextRun SCXW259565000 BCX0" data-ccp-charstyle="Hyperlink">Gemini</span></span><span class="TextRun Underlined SCXW259565000 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW259565000 BCX0" data-ccp-charstyle="Hyperlink"> Activity</span></span></a><span class="TextRun SCXW259565000 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW259565000 BCX0"> means </span></span><span class="TextRun SCXW259565000 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW259565000 BCX0">your new chats </span><span class="NormalTextRun SCXW259565000 BCX0">won’t</span><span class="NormalTextRun SCXW259565000 BCX0"> be used to improve the</span><span class="NormalTextRun SCXW259565000 BCX0">ir</span> <span class="NormalTextRun SCXW259565000 BCX0">machine learning services</span></span><span class="TextRun SCXW259565000 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW259565000 BCX0">, nor will they be seen by human reviewers, </span><span class="NormalTextRun SCXW259565000 BCX0">unless</span><span class="NormalTextRun SCXW259565000 BCX0"> you explicitly </span><span class="NormalTextRun SCXW259565000 BCX0">submit</span><span class="NormalTextRun SCXW259565000 BCX0"> them as feedback. This gives regular us</span><span class="NormalTextRun SCXW259565000 BCX0">ers a way </span><span class="NormalTextRun SCXW259565000 BCX0">to opt out, </span><span class="NormalTextRun SCXW259565000 BCX0">similar to</span><span class="NormalTextRun SCXW259565000 BCX0"> ChatGPT’s opt-out toggle.</span></span><span class="EOP SCXW259565000 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-cb94ad1 elementor-widget elementor-widget-image" data-id="cb94ad1" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img decoding="async" data-attachment-id="7901" data-permalink="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/image-1/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/image-1.png" data-orig-size="712,332" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image (1)" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/image-1-300x140.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/image-1.png" tabindex="0" role="button" width="712" height="332" src="https://inero-software.com/wp-content/uploads/2025/04/image-1.png" class="attachment-large size-large wp-image-7901" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/image-1.png 712w, https://inero-software.com/wp-content/uploads/2025/04/image-1-300x140.png 300w, https://inero-software.com/wp-content/uploads/2025/04/image-1-643x300.png 643w" sizes="(max-width: 712px) 100vw, 712px" data-attachment-id="7901" data-permalink="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/image-1/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/image-1.png" data-orig-size="712,332" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image (1)" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/image-1-300x140.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/image-1.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-e57f550 elementor-widget elementor-widget-text-editor" data-id="e57f550" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW161006688 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW161006688 BCX0">To stop saving your conversations, go to the </span></span><span class="TextRun SCXW161006688 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW161006688 BCX0">Activity </span></span><span class="TextRun SCXW161006688 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW161006688 BCX0">tab and toggle </span></span><span class="TextRun SCXW161006688 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW161006688 BCX0">Gemini Apps Activity</span></span><span class="TextRun SCXW161006688 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW161006688 BCX0">. </span><span class="NormalTextRun SCXW161006688 BCX0">You can also </span><span class="NormalTextRun SCXW161006688 BCX0">delete</span><span class="NormalTextRun SCXW161006688 BCX0"> your past conversations.</span></span><span class="EOP SCXW161006688 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-ca02a5a elementor-widget elementor-widget-heading" data-id="ca02a5a" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">API and Vertex AI </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-73c636b elementor-widget elementor-widget-text-editor" data-id="73c636b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW147481227 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW147481227 BCX0">If </span><span class="NormalTextRun SCXW147481227 BCX0">you’re</span><span class="NormalTextRun SCXW147481227 BCX0"> using Google Cloud’s </span></span><span class="TextRun SCXW147481227 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW147481227 BCX0">Vertex AI</span></span><span class="TextRun SCXW147481227 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW147481227 BCX0"> platform:</span></span><span class="EOP SCXW147481227 BCX0" data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5da483c elementor-widget elementor-widget-text-editor" data-id="5da483c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="5" data-aria-level="1"><span data-contrast="auto">Your prompts and outputs are </span><strong>not used to train AI models</strong><span data-contrast="auto"> without explicit permission.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="6" data-aria-level="1"><span data-contrast="auto">Data may be cached briefly (up to 24 hours) for performance but remains within your selected geographic region.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="7" data-aria-level="1"><span data-contrast="auto">Businesses can opt for a </span><strong>zero-retention policy</strong><span data-contrast="auto"> for maximum privacy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-15e5fa7 elementor-widget elementor-widget-heading" data-id="15e5fa7" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Data residency  </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-6769505 elementor-widget elementor-widget-text-editor" data-id="6769505" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW242043066 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW242043066 BCX0">Data residency is a strong point for Google: you can choose which geographic region your AI service runs in (e.g. </span><span class="NormalTextRun SCXW242043066 BCX0">EU or U</span><span class="NormalTextRun SCXW242043066 BCX0">S data </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242043066 BCX0">centers</span><span class="NormalTextRun SCXW242043066 BCX0">), and Google will process and store data in that region to meet any data localization requirements.</span></span><span class="EOP SCXW242043066 BCX0" data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-dffefeb elementor-widget elementor-widget-spacer" data-id="dffefeb" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-e7c3b12 elementor-widget elementor-widget-heading" data-id="e7c3b12" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Microsoft Azure OpenAI – Enterprise Data Protection by Design </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-36c3a52 elementor-widget elementor-widget-heading" data-id="36c3a52" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Training Policy </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-657a095 elementor-widget elementor-widget-text-editor" data-id="657a095" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Microsoft’s Azure OpenAI Service lets companies use OpenAI’s models through the trusted Azure cloud platform. </span><strong>Privacy is a major selling point here</strong><span data-contrast="auto"><strong>.</strong> Microsoft is very explicit: </span><strong>any data you send into Azure OpenAI is not used to train the underlying models</strong><span data-contrast="auto"> or improve Microsoft’s or OpenAI’s services</span><span data-ccp-props="{&quot;134245418&quot;:true,&quot;134245529&quot;:true}"> .</span></p><p><span data-ccp-props="{&quot;134245418&quot;:true,&quot;134245529&quot;:true}"> </span></p><p><span data-contrast="none">Microsoft’s Azure OpenAI Service essentially hosts OpenAI’s models (GPT-4, GPT-3.5, etc.) within the Microsoft Azure cloud. Microsoft has specifically designed this service for enterprises that require strong privacy protections. Key aspects are:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-96820b8 elementor-widget elementor-widget-text-editor" data-id="96820b8" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="none">Any data you input into Azure OpenAI – prompts, completions (model outputs), embeddings, fine-tuning data – is not used to train the AI models. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="none">Your inputs and outputs “are NOT available to other customers, are NOT available to OpenAI, and are NOT used to improve OpenAI models”. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="none">Microsoft only retains data as needed to provide the service and monitor for misuse. In fact, prompts and outputs on Azure are stored only temporarily (up to 30 days) by default, and solely for abuse detection purposes. After 30 days, those prompts are deleted. If even this temporary storage is a concern (say, for ultra-sensitive data), Microsoft offers a process called “modified abuse monitoring” where you can request that even the 30-day storage be bypassed, meaning no prompts are retained at all. Typically, you’d need approval for this exception, but it’s an option for high-security scenarios.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-5e2b615 elementor-widget elementor-widget-heading" data-id="5e2b615" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Data Residency </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-82ccf7d elementor-widget elementor-widget-text-editor" data-id="82ccf7d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW93588553 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW93588553 BCX0">Because </span><span class="NormalTextRun SCXW93588553 BCX0">it’s</span><span class="NormalTextRun SCXW93588553 BCX0"> on Azure, you also </span><span class="NormalTextRun SCXW93588553 BCX0">benefit</span><span class="NormalTextRun SCXW93588553 BCX0"> from easily choosing the region and </span><span class="NormalTextRun SCXW93588553 BCX0">complying with</span><span class="NormalTextRun SCXW93588553 BCX0"> data residency requirements. When setting up Azure OpenAI, you deploy the service to an Azure region (for example, East US, West Europe, Southeast Asia, etc.). All processing and data storage for inference will occur within that region or its geographical boundary. So, if you deploy in Western Europe, your data </span><span class="NormalTextRun SCXW93588553 BCX0">isn’t</span><span class="NormalTextRun SCXW93588553 BCX0"> leaving Europe </span><span class="NormalTextRun SCXW93588553 BCX0">&#8211;</span><span class="NormalTextRun SCXW93588553 BCX0"> crucial for GDPR compliance. Azure itself meets </span><span class="NormalTextRun SCXW93588553 BCX0">numerous</span><span class="NormalTextRun SCXW93588553 BCX0"> compliance standards (SOC 2, ISO 27001, </span><span class="NormalTextRun SCXW93588553 BCX0">etc.)</span><span class="NormalTextRun SCXW93588553 BCX0">, and these extend to Azure OpenAI as an Azure service.</span></span><span class="EOP SCXW93588553 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-cea2902 elementor-widget elementor-widget-spacer" data-id="cea2902" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-0013609 elementor-widget elementor-widget-heading" data-id="0013609" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Anthropic (Claude) – A Privacy-First AI Assistant </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-6f1b8b4 elementor-widget elementor-widget-heading" data-id="6f1b8b4" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Training Policy </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-988001e elementor-widget elementor-widget-text-editor" data-id="988001e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW126360551 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW126360551 BCX0">Anthropic, the company behind the Claude AI assistant (Claude 2 and newer versions), has emphasized a privacy-conscious approach from the outset. </span><span class="NormalTextRun SCXW126360551 BCX0">Anthropic adopts an opt-in approach:</span></span><span class="EOP SCXW126360551 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-3f7e219 elementor-widget elementor-widget-text-editor" data-id="3f7e219" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="7" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="none">By default, Anthropic does not use your conversations or data to train its models. This applies to both their commercial offerings (</span><a href="https://privacy.anthropic.com/en/collections/10663361-commercial-customers"><span data-contrast="none">Claude for Work, Anthropic API</span></a><span data-contrast="none">)</span> <span data-contrast="none">and consumer products (Claude Free, Claude Pro)</span> <span data-contrast="none">– your prompts and Claude’s responses aren’t automatically used for model training. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="7" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="none">They only use data if you deliberately opt-in, such as by providing explicit feedback. For instance, if you click a thumbs-up/down in a Claude interface or send data to their feedback channels, you’re essentially saying “you can learn from this”.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-00d6994 elementor-widget elementor-widget-text-editor" data-id="00d6994" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW11925797 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW11925797 BCX0">For enterprise clients, Anthropic offers Claude Team/Enterprise, which not only guarantees no training on your data but also provides admin controls. One such feature is custom data retention settings. By default, Anthropic’s systems might </span><span class="NormalTextRun SCXW11925797 BCX0">retain</span><span class="NormalTextRun SCXW11925797 BCX0"> your inputs/outputs indefinitely for your account (though not for training). However, Claude Enterprise admins can set a retention policy – for example, you might set it to </span><span class="NormalTextRun SCXW11925797 BCX0">delete</span><span class="NormalTextRun SCXW11925797 BCX0"> all conversation data after 30 days, 60 days, etc., with 30 days being the current minimum. These controls aim to support compliance with regulations like GDPR.</span></span><span class="EOP SCXW11925797 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5a24ccc elementor-widget elementor-widget-heading" data-id="5a24ccc" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h5 class="elementor-heading-title elementor-size-default">Data Residency  </h5>		</div>
				</div>
				<div class="elementor-element elementor-element-3cf1dac elementor-widget elementor-widget-text-editor" data-id="3cf1dac" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW47979688 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW47979688 BCX0">Anthropic is a newer player, and currently, when you use their API directly, you </span><span class="NormalTextRun SCXW47979688 BCX0">don’t</span><span class="NormalTextRun SCXW47979688 BCX0"> explicitly choose a data region – </span><span class="NormalTextRun SCXW47979688 BCX0">it’s</span> <span class="NormalTextRun SCXW47979688 BCX0">likely hosted</span><span class="NormalTextRun SCXW47979688 BCX0"> in the US by Anthropic (or </span><span class="NormalTextRun SCXW47979688 BCX0">possibly through</span><span class="NormalTextRun SCXW47979688 BCX0"> cloud providers like AWS in the US region). However, Anthropic models are also available through partners, which can help with data residency. For example, Anthropic’s Claude is offered via Amazon Bedrock (AWS’s AI service) and via Google Cloud Vertex AI. If you use Claude through one of these platforms, you can take advantage of AWS’s or Google’s region controls.</span></span><span class="EOP SCXW47979688 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-a2a60c8 elementor-widget elementor-widget-spacer" data-id="a2a60c8" data-element_type="widget" data-widget_type="spacer.default">
				<div class="elementor-widget-container">
					<div class="elementor-spacer">
			<div class="elementor-spacer-inner"></div>
		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-de688f7 elementor-widget elementor-widget-heading" data-id="de688f7" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Conclusion </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-9f1b51c elementor-widget elementor-widget-text-editor" data-id="9f1b51c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Understanding the </span><strong>data collection practices of LLM providers</strong><span data-contrast="auto"> is crucial for<b> </b></span><strong>AI compliance</strong><span data-contrast="auto">, customer trust, and corporate governance. Whether you&#8217;re focused on compliance, customer trust, or internal data governance, these insights help you make informed decisions. Choose providers that align with your privacy values &#8211; and always review your settings.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">Here&#8217;s a comparison of major platforms:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-4597621 elementor-widget elementor-widget-html" data-id="4597621" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<style>
  table {
    width: 100%;
    border-collapse: collapse;
    font-family: 'Roboto', sans-serif;
    font-weight: 300;
    font-size: 14px;
    color: #1C244B;
  }
  th, td {
    border: 1px solid #ccc;
    padding: 8px;
    text-align: left;
    vertical-align: top;
  }
  th {
    background-color: #f2f2f2;
  }
  a {
    color: #1C244B;
    text-decoration: underline;
  }
</style>

<table>
  <thead>
    <tr>
      <th>Provider</th>
      <th>Default Data Training</th>
      <th>Web App Setting</th>
      <th>Data Residency Options</th>
      <th>GDPR/CCPA Compliance</th>
      <th>Privacy Policy</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>OpenAI</td>
      <td>No (API)</td>
      <td>Opt-out available</td>
      <td>No; (unless used via Azure Microsoft)</td>
      <td>Yes</td>
      <td><a href="https://openai.com/policies/privacy-policy" target="_blank">Consumer privacy</a></td>
    </tr>
    <tr>
      <td>Google</td>
      <td>No (Cloud + Gemini)</td>
      <td>No training by default</td>
      <td>Broad region control</td>
      <td>Yes</td>
      <td>
        <a href="https://policies.google.com/privacy" target="_blank">Enterprise privacy</a>, 
        <a href="https://www.google.com/intl/en_us/gemini/privacy" target="_blank">Gemini privacy</a>, 
        <a href="https://cloud.google.com/vertex-ai/docs/general/privacy-overview" target="_blank">Vertex AI</a>
      </td>
    </tr>
    <tr>
      <td>Azure</td>
      <td>No</td>
      <td>N/A</td>
      <td>Full regional control</td>
      <td>Yes</td>
      <td><a href="https://privacy.microsoft.com/en-us/privacystatement" target="_blank">Azure, OpenAI privacy</a></td>
    </tr>
    <tr>
      <td>Anthropic</td>
      <td>No</td>
      <td>No training by default</td>
      <td>No (unless used via partners)</td>
      <td>Yes</td>
      <td>
        <a href="https://www.anthropic.com/legal/privacy" target="_blank">API users</a>, 
        <a href="https://claude.ai/privacy" target="_blank">Claude.ai users</a>
      </td>
    </tr>
  </tbody>
</table>
		</div>
				</div>
				<div class="elementor-element elementor-element-5234314 elementor-widget elementor-widget-text-editor" data-id="5234314" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW227920897 BCX0">For maximum privacy and control, </span></span><span class="TextRun SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW227920897 BCX0"><b>local deployment</b></span></span><span class="TextRun SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW227920897 BCX0"><b> </b>(on-premises models) is always an alternative. This avoids cloud storage concerns entirely.</span><span class="NormalTextRun SCXW227920897 BCX0"> You can read more about local deployment </span></span><a class="Hyperlink SCXW227920897 BCX0" href="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW227920897 BCX0" data-ccp-charstyle="Hyperlink">here</span></span></a><span class="TextRun SCXW227920897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW227920897 BCX0">.</span></span><span class="EOP SCXW227920897 BCX0" data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-7c7244d elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="7c7244d" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<div class="elementor-cta">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/03/cta-1903-1030x579.png);" role="img" aria-label="cta 1903"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Let's talk about AI agents 					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Ready to bring AI into your business? Let us help you get started.					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<a class="elementor-cta__button elementor-button elementor-size-" href="https://inero-software.com/contact-us/">
						Contact us					</a>
					</div>
							</div>
						</div>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>Artykuł <a href="https://inero-software.com/ai-user-privacy-an-analysis-of-platform-policies/">AI User Privacy: An Analysis of Platform Policies</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7890</post-id>	</item>
		<item>
		<title>Top Lightweight LLMs for Local Deployment</title>
		<link>https://inero-software.com/top-lightweight-llms-for-local-deployment/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Thu, 17 Apr 2025 09:50:46 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[Lightweight LLMs]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7843</guid>

					<description><![CDATA[<p>In this post, we’ll explore several top open-source lightweight LLMs and how to run them on a local Windows PC—whether CPU-only or with a limited GPU—for document processing tasks. </p>
<p>Artykuł <a href="https://inero-software.com/top-lightweight-llms-for-local-deployment/">Top Lightweight LLMs for Local Deployment</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7843" class="elementor elementor-7843" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-cc31ada e-flex e-con-boxed e-con e-parent" data-id="cc31ada" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-2485c29 elementor-widget elementor-widget-html" data-id="2485c29" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
					</div>
				</div>
				<div class="elementor-element elementor-element-d3520b4 elementor-widget elementor-widget-text-editor" data-id="d3520b4" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h5><strong><span class="TrackedChange SCXW35608661 BCX0"><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun TrackChangeDeleteHighlight SCXW35608661 BCX0">Running large language models (LLMs) on your own hardware has become increasingly </span></span></span><span class="TrackedChange SCXW35608661 BCX0"><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun TrackChangeDeleteHighlight SCXW35608661 BCX0">feasible</span></span></span><span class="TrackedChange SCXW35608661 BCX0"><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun TrackChangeDeleteHighlight SCXW35608661 BCX0"> thanks to </span></span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">lightweight LLMs</span><span class="NormalTextRun SCXW35608661 BCX0">—models w</span><span class="NormalTextRun SCXW35608661 BCX0">ith</span> <span class="NormalTextRun SCXW35608661 BCX0">relatively small</span><span class="NormalTextRun SCXW35608661 BCX0"> parameter counts that deliver </span><span class="NormalTextRun SCXW35608661 BCX0">strong performance</span><span class="NormalTextRun SCXW35608661 BCX0"> without requiring server-grade GPUs.</span><span class="NormalTextRun SCXW35608661 BCX0"> In this post, </span><span class="NormalTextRun SCXW35608661 BCX0">we’ll</span><span class="NormalTextRun SCXW35608661 BCX0"> explore several top open-source lightweight LLMs and how to run them on a local Windows PC—whether CPU-only or with a limited GPU—for document processing tasks.</span> </span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">We also include a </span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">benchmark comparing the models</span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0"> in terms of </span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">accuracy and inference speed</span></span><span class="TextRun SCXW35608661 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW35608661 BCX0">, helping you choose the right model for your local environment and use case.</span></span><span class="EOP SCXW35608661 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:299,&quot;335559739&quot;:299}"> </span></strong></h5>						</div>
				</div>
				<div class="elementor-element elementor-element-10359f9 elementor-widget elementor-widget-heading" data-id="10359f9" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">What Are Lightweight LLMs (and Why Run Them Locally)? </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-621d40f elementor-widget elementor-widget-text-editor" data-id="621d40f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW177302101 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW177302101 BCX0">“Lightweight” LLMs are models typically in the range of ~1–8 billion parameters – far smaller than GPT-3 class models – often optimized to run on a single GPU or even CPU. They are usually released as open models with freely available weights. These models trade some raw power for efficiency, but recent research and clever engineering (better data, distilled training, efficient attention mechanisms, etc.) have dramatically improved their capabilities. Many can now match or beat much larger models on specific benchmarks</span><span class="NormalTextRun SCXW177302101 BCX0">.</span></span><span class="EOP SCXW177302101 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-81497fe elementor-widget elementor-widget-text-editor" data-id="81497fe" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Local deployment of such models is valuable for several reasons:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><strong>Privacy &amp; Security:</strong><span data-contrast="auto"> All data stays on your machine, which is crucial for confidential documents like insurance contracts. You’re not sending sensitive text to a third-party API.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>Cost Savings:</strong><span data-contrast="auto"> Once downloaded, local models run </span><strong>for free</strong><span data-contrast="auto"> – no API usage fees or cloud compute bills. This can make a big difference if you process large volumes of documents regularly.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><strong>Latency &amp; Offline Access:</strong><span data-contrast="auto"> Local inference eliminates network latency. Responses can be near-instant on a GPU, and you can operate entirely offline. This is useful for on-site workflows or when internet access is restricted.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><strong>Customization:</strong><span data-contrast="auto"> With local models you have full control – you can adjust parameters, prompts, or fine-tune models to better fit your domain (e.g. insurance data) without vendor limits.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><p><span data-contrast="auto">In short, lightweight LLMs put AI capabilities directly in your hands, on hardware you own. Next, we’ll compare some of the leading open models that are well-suited for local document processing.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-6e958d1 elementor-widget elementor-widget-heading" data-id="6e958d1" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Comparing Top Lightweight LLMs </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-adbf2c8 elementor-widget elementor-widget-text-editor" data-id="adbf2c8" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW101152181 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW101152181 BCX0">Lightweight open-source large language models (LLMs) are becoming a practical choice for organizations looking to run AI workloads locally. They offer a strong balance between performance, speed, and resource requirements—making them ideal for document summarization, extraction, and classification without relying on cloud infrastructure. </span></span><span class="EOP SCXW101152181 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
					</div>
				</div>
		<div class="elementor-element elementor-element-330c9fe e-flex e-con-boxed e-con e-parent" data-id="330c9fe" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-73949bc elementor-widget elementor-widget-text-editor" data-id="73949bc" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">We’ll focus on the following open-source models (each with downloadable checkpoints) that have a good reputation for quality relative to their size:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-6703794 elementor-widget elementor-widget-text-editor" data-id="6703794" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><strong>Llama 3.1</strong><span data-contrast="auto"> – 8B parameters (Meta AI)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>StableLM Zephyr</strong><span data-contrast="auto"> – 3B parameters (Stability AI)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><strong>Llama 3.2</strong><span data-contrast="auto"> – 1B/3B parameters (Meta AI)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><strong>Mistral</strong><span data-contrast="auto"> – 7B parameters (Mistral AI)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="5" data-aria-level="1"><strong>Gemma 3</strong><span data-contrast="auto"> – 1B and 4B variants (Google DeepMind)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="6" data-aria-level="1"><strong>DeepSeek R1</strong><span data-contrast="auto"> – 1.5B and 7B variants (DeepSeek AI)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="7" data-aria-level="1"><strong>Phi-4 Mini</strong><span data-contrast="auto"> – 3.8B parameters (Microsoft)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="8" data-aria-level="1"><strong>TinyLlama</strong><span data-contrast="auto"> – 1.1B parameters (community project)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-f98ca55 elementor-widget elementor-widget-text-editor" data-id="f98ca55" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><span data-contrast="auto">These models range from very small (under 1 GB on disk) to mid-sized (~5 GB). All can be run in inference mode on a 16 GB GPU (often even in half-precision or 4-bit quantized form) and many are workable on CPU with enough RAM and patience. Table 1 summarizes their characteristics:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-71cd074 elementor-widget elementor-widget-html" data-id="71cd074" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<style>
  @import url('https://fonts.googleapis.com/css2?family=Roboto:wght@300&display=swap');

  .model-table {
    font-family: 'Roboto', sans-serif;
    font-weight: 300;
    font-size: 14px;
    color: #1C244B;
    border-collapse: collapse;
    width: 100%;
  }

  .model-table th, .model-table td {
    border: 1px solid #ccc;
    padding: 8px;
    text-align: left;
    color: #1C244B;
  }

  .model-table th {
    background-color: #f2f2f2;
  }
</style>

<table class="model-table">
  <thead>
    <tr>
      <th>Model</th>
      <th>Size on Disk (quantized)</th>
      <th>Max Context</th>
      <th>Licence</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Llama 3.1 (8B)</td>
      <td>4.9GB</td>
      <td>128k tokens</td>
      <td>Open-source</td>
    </tr>
    <tr>
      <td>StableLM Zephyr (3B)</td>
      <td>1.6GB</td>
      <td>4k tokens</td>
      <td>Only non-commercial use</td>
    </tr>
    <tr>
      <td>Llama 3.2 (3B)</td>
      <td>2.0GB</td>
      <td>128k tokens</td>
      <td>Open-source</td>
    </tr>
    <tr>
      <td>Mistral (7B)</td>
      <td>4.1GB</td>
      <td>32k tokens</td>
      <td>Open-source (Apache 2.0)</td>
    </tr>
    <tr>
      <td>Gemma 3 (4B)</td>
      <td>3.3GB</td>
      <td>128k tokens</td>
      <td>Open-source</td>
    </tr>
    <tr>
      <td>Gemma 3 (1B)</td>
      <td>0.8GB</td>
      <td>32k tokens</td>
      <td>Open-source</td>
    </tr>
    <tr>
      <td>DeepSeek R1 (7B)</td>
      <td>4.7GB</td>
      <td>128k tokens</td>
      <td>Open-source (MIT licence)</td>
    </tr>
    <tr>
      <td>DeepSeek R1 (1.5B)</td>
      <td>1.1GB</td>
      <td>128k tokens</td>
      <td>Open-source (MIT licence)</td>
    </tr>
    <tr>
      <td>Phi-4 Mini (3.8B)</td>
      <td>2.5GB</td>
      <td>128k tokens</td>
      <td>Open-source</td>
    </tr>
    <tr>
      <td>TinyLlama (1.1B)</td>
      <td>0.6GB</td>
      <td>2k tokens</td>
      <td>Open-source</td>
    </tr>
  </tbody>
</table>
		</div>
				</div>
				<div class="elementor-element elementor-element-55c06b4 elementor-widget elementor-widget-text-editor" data-id="55c06b4" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="TextRun SCXW254867370 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW254867370 BCX0">Table 1:</span></span><span class="TextRun SCXW254867370 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW254867370 BCX0"> Lightweight LLMs for local use – model sizes a</span><span class="NormalTextRun SCXW254867370 BCX0">nd</span> <span class="NormalTextRun SCXW254867370 BCX0">maximum</span><span class="NormalTextRun SCXW254867370 BCX0"> context windo</span><span class="NormalTextRun SCXW254867370 BCX0">w</span><span class="NormalTextRun SCXW254867370 BCX0">.</span></span><span class="EOP SCXW254867370 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h6>						</div>
				</div>
				<div class="elementor-element elementor-element-58e51e9 elementor-widget elementor-widget-text-editor" data-id="58e51e9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW7653520 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW7653520 BCX0">Notes:</span></span></strong><span class="TextRun SCXW7653520 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW7653520 BCX0"> “Max Context” is the maximum sequence length (tokens) the model can process in one go. </span></span><span class="EOP SCXW7653520 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-223eda5 elementor-widget elementor-widget-text-editor" data-id="223eda5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW99345828 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW99345828 BCX0">Next, </span><span class="NormalTextRun SCXW99345828 BCX0">let’s</span><span class="NormalTextRun SCXW99345828 BCX0"> look at each model’s </span></span><span class="TextRun SCXW99345828 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW99345828 BCX0">pros and cons</span></span><span class="TextRun SCXW99345828 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW99345828 BCX0">, especially in the context of document tasks:</span></span><span class="EOP SCXW99345828 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-7192f01 elementor-widget elementor-widget-text-editor" data-id="7192f01" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><strong>Llama 3.1 (8B)</strong><span data-contrast="auto"><strong>:</strong> Powerful general-purpose model; moderate size and strong multilingual capabilities. Heavy for CPU-only systems; requires chunking for long documents.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>StableLM Zephyr (3B)</strong><span data-contrast="auto"><strong>:</strong> Ultra-lightweight, good for basic QA/extraction. Limited by small parameter count and commercial license restrictions.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><strong>Llama 3.2 (3B)</strong><span data-contrast="auto">: Excellent summarization and retrieval; long context support (128k tokens). Smaller size affects complex reasoning accuracy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><strong>Mistral (7B)</strong><span data-contrast="auto"><strong>:</strong> Best overall performer for its size; highly efficient inference. Ideal for detailed summarization tasks.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="5" data-aria-level="1"><strong>Gemma 3 (4B/1B)</strong><span data-contrast="auto">: Offers multimodal capabilities and extensive multilingual support. The 4B model balances capability and speed; the 1B model best suited for simple tasks.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="6" data-aria-level="1"><strong>DeepSeek R1 (7B/1.5B)</strong><span data-contrast="auto">: Balanced efficiency and comprehension for general NLP tasks; limited complex reasoning compared to Mistral.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="7" data-aria-level="1"><strong>Phi-4 Mini (3.8B)</strong><span data-contrast="auto">: Exceptional reasoning, math, and logical capabilities; perfect for analytical document processing. English-focused.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:1080,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="8" data-aria-level="1"><strong>TinyLlama (1.1B)</strong><span data-contrast="auto">: Extremely lightweight; suitable for basic text extraction/classification tasks. Limited contextual understanding.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-906c9d8 elementor-widget elementor-widget-text-editor" data-id="906c9d8" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW259074413 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW259074413 BCX0">The models reviewed above cover a wide range of sizes and capabilities. Larger variants like Llama 3.1 and Mistral perform well on complex summarization and multilingual tasks but are less suited for CPU-only setups. Mid-sized models such as Llama 3.2 and Gemma 3 (4B) handle long inputs efficiently with reasonable performance. Smaller models, including </span><span class="NormalTextRun SpellingErrorV2Themed SCXW259074413 BCX0">TinyLlama</span><span class="NormalTextRun SCXW259074413 BCX0"> and </span><span class="NormalTextRun SpellingErrorV2Themed SCXW259074413 BCX0">StableLM</span><span class="NormalTextRun SCXW259074413 BCX0"> Zephyr, are lightweight and fast, making them practical for basic extraction or classification tasks.</span></span><span class="EOP SCXW259074413 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-013ecbc elementor-widget elementor-widget-heading" data-id="013ecbc" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Models Benchmarking: Document Extraction and Summarization </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-f583b4c elementor-widget elementor-widget-text-editor" data-id="f583b4c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW65580225 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW65580225 BCX0">Here we outline a simple </span><span class="NormalTextRun SCXW65580225 BCX0">model </span><span class="NormalTextRun SCXW65580225 BCX0">benchmarking plan covering t</span><span class="NormalTextRun SCXW65580225 BCX0">wo</span><span class="NormalTextRun SCXW65580225 BCX0"> common document-processing tasks:</span></span><span class="EOP SCXW65580225 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-236155a elementor-widget elementor-widget-text-editor" data-id="236155a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ol><li><strong> Information Extraction:</strong><span data-contrast="auto"> We evaluated how well each model can extract specific fields from a policy or certificate. Specifically, we prompted each model to find the </span><b><span data-contrast="auto">p</span></b><strong>olicy number, insured name</strong><span data-contrast="auto"><strong>,</strong> VAT ID, address and insurance period in the document text and return the structured output &#8211; clean JSON response with all the needed values.</span></li><li><strong> Summarization: </strong><span data-contrast="auto">Each model generated a concise summary of an insurance policy, covering key points such as coverage, exclusions, and conditions.We rated the summaries on clarity, correctness, factual accuracy and readability and penalized heavily fabricating information.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ol>						</div>
				</div>
				<div class="elementor-element elementor-element-02421da elementor-widget elementor-widget-text-editor" data-id="02421da" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentStart SCXW43958002 BCX0">We used 11 document</span><span class="NormalTextRun SCXW43958002 BCX0">s</span><span class="NormalTextRun SCXW43958002 BCX0"> and</span><span class="NormalTextRun SCXW43958002 BCX0"> </span><span class="NormalTextRun SCXW43958002 BCX0">ran all t</span><span class="NormalTextRun SCXW43958002 BCX0">ests using </span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SpellingErrorV2Themed SCXW43958002 BCX0">Ollama</span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0"> <a href="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/">(</a></span><span class="NormalTextRun SCXW43958002 BCX0">you can read about </span><span class="NormalTextRun SCXW43958002 BCX0">running model with </span><span class="NormalTextRun SpellingErrorV2Themed SCXW43958002 BCX0">Ollama</span> <span class="NormalTextRun CommentStart SCXW43958002 BCX0">here</span><span class="NormalTextRun SCXW43958002 BCX0">)</span><span class="NormalTextRun SCXW43958002 BCX0">.</span><span class="NormalTextRun SCXW43958002 BCX0"> </span><span class="NormalTextRun SCXW43958002 BCX0">The benchmarks were performed on a PC equipped with an</span><span class="NormalTextRun SCXW43958002 BCX0"> NVIDIA </span><span class="NormalTextRun SCXW43958002 BCX0">GeForce RTX 2060 </span><span class="NormalTextRun SCXW43958002 BCX0">and </span><span class="NormalTextRun SCXW43958002 BCX0">6</span><span class="NormalTextRun SCXW43958002 BCX0"> GB </span><span class="NormalTextRun SCXW43958002 BCX0">V</span><span class="NormalTextRun SCXW43958002 BCX0">RAM.</span> <span class="NormalTextRun SCXW43958002 BCX0">To ensure consistent results, each model was run with </span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0">temperature set to 0</span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0"> for the extraction task (to produce deterministic outputs), and with a fixed </span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0">temperature of 0.7</span></span><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0"> for summarization. For the extraction task, we also used </span></span><a class="Hyperlink SCXW43958002 BCX0" href="https://ollama.com/blog/structured-outputs" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW43958002 BCX0" data-ccp-charstyle="Hyperlink">structured outputs</span></span></a><span class="TextRun SCXW43958002 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW43958002 BCX0">:</span> </span><span class="EOP SCXW43958002 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335557856&quot;:16777215,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:270}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f1a279a elementor-widget elementor-widget-text-editor" data-id="f1a279a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre> <br /><br /><span data-contrast="none">{</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"model"</span><span data-contrast="none">: </span><span data-contrast="none">"deepseek-r1:7b"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"prompt"</span><span data-contrast="none">: </span><span data-contrast="none">"You are an assistant that extracts insurance-related information from a given input text. You must extract and return only the following fields: - policy_number,- insurance_period,- insured (company or person name),- nip (tax identification number),- address (of the insured). Return the output as a **clean JSON object** — not as a string, not inside quotes, and without any commentary. If a field is missing, use 'Not found'. Document text: "</span><span data-contrast="none">,</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335557856&quot;:16777215,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:270}"> </span><br /><br /><span data-contrast="none">    </span><span data-contrast="none">"stream"</span><span data-contrast="none">: </span><b><span data-contrast="none">false</span></b><span data-contrast="none">,</span> <br /><span data-contrast="none">    </span><span data-contrast="none">"format"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">    </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"object"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">    </span><span data-contrast="none">"properties"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"policy_number"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insurance_period_start"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insurance_period_end"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured_nip"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      },</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured_address"</span><span data-contrast="none">: {</span> <br /><span data-contrast="none">        </span><span data-contrast="none">"type"</span><span data-contrast="none">: </span><span data-contrast="none">"string"</span> <br /><span data-contrast="none">      }</span> <br /><span data-contrast="none">    },</span> <br /><span data-contrast="none">    </span><span data-contrast="none">"required"</span><span data-contrast="none">: [</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"policy_number"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insurance_period_start"</span><span data-contrast="none">, </span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insurance_period_end"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured_nip"</span><span data-contrast="none">,</span> <br /><span data-contrast="none">      </span><span data-contrast="none">"insured_address"</span> <br /><span data-contrast="none">    ]</span> <br /><span data-contrast="none">  }</span> <br /><span data-contrast="none">}</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335557856&quot;:16777215,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:270}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-a6fbca0 elementor-widget elementor-widget-image" data-id="a6fbca0" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7846" data-permalink="https://inero-software.com/top-lightweight-llms-for-local-deployment/attachment/111553/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/111553.png" data-orig-size="1154,649" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="111553" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/111553-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/111553-1030x579.png" tabindex="0" role="button" width="1030" height="579" src="https://inero-software.com/wp-content/uploads/2025/04/111553-1030x579.png" class="attachment-large size-large wp-image-7846" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/111553-1030x579.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/111553-300x169.png 300w, https://inero-software.com/wp-content/uploads/2025/04/111553-768x432.png 768w, https://inero-software.com/wp-content/uploads/2025/04/111553-533x300.png 533w, https://inero-software.com/wp-content/uploads/2025/04/111553.png 1154w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7846" data-permalink="https://inero-software.com/top-lightweight-llms-for-local-deployment/attachment/111553/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/111553.png" data-orig-size="1154,649" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="111553" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/111553-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/111553-1030x579.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-c923f73 elementor-widget elementor-widget-text-editor" data-id="c923f73" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="TextRun SCXW85460195 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW85460195 BCX0">Examples of insurance </span><span class="NormalTextRun SCXW85460195 BCX0">certifacates</span><span class="NormalTextRun SCXW85460195 BCX0">.</span></span><span class="EOP SCXW85460195 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h6>						</div>
				</div>
				<div class="elementor-element elementor-element-e9e7e62 elementor-widget elementor-widget-text-editor" data-id="e9e7e62" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW36022441 BCX0">The table below presents the benchmark results.</span></span></strong> <span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW36022441 BCX0">Extraction accuracy</span></span><span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW36022441 BCX0"> refers to the number of documents (out of 11) where the model successfully extracted all key fields. </span></span><span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW36022441 BCX0">Token/sec</span></span><span class="TextRun SCXW36022441 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"> <span class="NormalTextRun SCXW36022441 BCX0">indicates</span><span class="NormalTextRun SCXW36022441 BCX0"> the model’s inference speed — how quickly it generates responses.</span></span><span class="EOP SCXW36022441 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-e5f35c8 elementor-widget elementor-widget-html" data-id="e5f35c8" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<style>
  @import url('https://fonts.googleapis.com/css2?family=Roboto:wght@300&display=swap');

  .model-table {
    font-family: 'Roboto', sans-serif;
    font-weight: 300;
    font-size: 14px;
    color: #1C244B;
    border-collapse: collapse;
    width: 100%;
  }

  .model-table th, .model-table td {
    border: 1px solid #ccc;
    padding: 8px;
    text-align: left;
    color: #1C244B;
  }

  .model-table th {
    background-color: #f2f2f2;
  }

  .green-bg {
    background-color: #DFF0D8;
  }

  .red-bg {
    background-color: #F2DEDE;
  }
</style>

<table class="model-table">
  <thead>
    <tr>
      <th>Model</th>
      <th>Summarization</th>
      <th>Extraction Accuracy</th>
      <th>Tokens/sec</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Llama 3.1 (8B)</td>
      <td class="green-bg">High-quality, no hallucinations</td>
      <td>10/11</td>
      <td>13.49</td>
    </tr>
    <tr>
      <td>StableLM 3B</td>
      <td class="red-bg">Average quality, typos/hallucinations</td>
      <td>4/11</td>
      <td>56.51</td>
    </tr>
    <tr>
      <td>Llama 3.2 (3B)</td>
      <td class="green-bg">Concise yet comprehensive summary, no hallucinations</td>
      <td>8/11</td>
      <td>49.49</td>
    </tr>
    <tr>
      <td>Mistral 7B</td>
      <td>Extensive summary, factually correct</td>
      <td>8/11</td>
      <td>29.01</td>
    </tr>
    <tr>
      <td>Gemma 3 4B</td>
      <td class="green-bg">Concise yet comprehensive summary, no hallucinations</td>
      <td>10/11</td>
      <td>13.37</td>
    </tr>
    <tr>
      <td>Gemma 3 1B</td>
      <td class="green-bg">Concise yet comprehensive summary, no hallucinations</td>
      <td>4/11</td>
      <td>73.46</td>
    </tr>
    <tr>
      <td>DeepSeek 7B</td>
      <td class="green-bg">Concise yet comprehensive summary, no hallucinations</td>
      <td>6/11</td>
      <td>16.39</td>
    </tr>
    <tr>
      <td>DeepSeek 1.5B</td>
      <td class="red-bg">Very poor, frequent hallucinations/errors</td>
      <td>0/11</td>
      <td>66.45</td>
    </tr>
    <tr>
      <td>Phi-4 Mini 3.8B</td>
      <td>Very concise summaries, factually correct</td>
      <td>9/11</td>
      <td>39.31</td>
    </tr>
    <tr>
      <td>TinyLlama 1.1B</td>
      <td class="red-bg">Poor quality, severe hallucinations</td>
      <td>2/11</td>
      <td>107.34</td>
    </tr>
  </tbody>
</table>
		</div>
				</div>
				<div class="elementor-element elementor-element-4f30579 elementor-widget elementor-widget-text-editor" data-id="4f30579" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="TextRun SCXW220458249 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW220458249 BCX0">Table 2: </span><span class="NormalTextRun SCXW220458249 BCX0">B</span><span class="NormalTextRun SCXW220458249 BCX0">enchmarking results.</span></span><span class="TextRun SCXW220458249 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW220458249 BCX0"> </span></span><span class="EOP SCXW220458249 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h6>						</div>
				</div>
				<div class="elementor-element elementor-element-1046393 elementor-widget elementor-widget-image" data-id="1046393" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7847" data-permalink="https://inero-software.com/top-lightweight-llms-for-local-deployment/lightweight-llm-scatterplot/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot.png" data-orig-size="1968,1180" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="lightweight-llm-scatterplot" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-300x180.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1030x618.png" tabindex="0" role="button" width="1030" height="618" src="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1030x618.png" class="attachment-large size-large wp-image-7847" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1030x618.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-300x180.png 300w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-768x460.png 768w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1536x921.png 1536w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-500x300.png 500w, https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot.png 1968w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7847" data-permalink="https://inero-software.com/top-lightweight-llms-for-local-deployment/lightweight-llm-scatterplot/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot.png" data-orig-size="1968,1180" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="lightweight-llm-scatterplot" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-300x180.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/lightweight-llm-scatterplot-1030x618.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-704e9c5 elementor-widget elementor-widget-text-editor" data-id="704e9c5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW241422309 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW241422309 BCX0">This scatterplot visualizes the </span></span><span class="TextRun SCXW241422309 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW241422309 BCX0">trade-off between extraction accuracy and inference speed</span></span><span class="TextRun SCXW241422309 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW241422309 BCX0"> (measured in tokens per second)</span></span><span class="EOP SCXW241422309 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5527166 elementor-widget elementor-widget-text-editor" data-id="5527166" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">The benchmarking results reveal significant variations among the tested models. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559685&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><strong>Bottom-right</strong><span data-contrast="auto"> models &#8211; </span><strong>Llama 3.1 (8B), Gemma 3 (4B)</strong><span data-contrast="auto">, and </span><strong>Phi-4 Mini (3.8B)</strong> <span data-contrast="auto">&#8211; </span><span data-contrast="auto">excel in summarization quality and extraction accuracy, consistently providing concise and accurate outputs. Phi-4 Mini seems to offer a good trade-off between speed and accuracy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>Mistral 7B, DeepSeek 7B, Llama 3.2</strong><span data-contrast="auto"> generate detailed and informative summaries, though their extraction performance is more moderate.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="-" data-font="Aptos" data-listid="6" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Aptos&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;-&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="auto">On the other hand, </span><strong>smaller models</strong> <span data-contrast="auto">(on the top-left side of the chart) like </span><strong><i>StableLM Zephyr (3B), Gemma 3 (1B)</i> and <i>TinyLlama</i></strong><i><span data-contrast="auto"> (1.1B)</span></i><span data-contrast="auto"> show significantly weaker extraction accuracy and are prone to frequent hallucinations. However, they benefit from faster inference times. Their limited context windows (e.g., 4k tokens) may contribute to these shortcomings. Overall, they may be suitable for only very basic tasks.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-1ac20ae elementor-widget elementor-widget-heading" data-id="1ac20ae" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Choosing the Right Model for Your Needs </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-11e1bfc elementor-widget elementor-widget-text-editor" data-id="11e1bfc" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">When selecting a language model for document extraction or summarization, </span><span class="NormalTextRun SCXW204701935 BCX0">it’s</span><span class="NormalTextRun SCXW204701935 BCX0"> all about balancing </span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">accuracy</span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">, </span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">speed</span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">, and </span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">hardware constraints</span></span><span class="TextRun SCXW204701935 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW204701935 BCX0">. Below is a quick breakdown to help you pick the best fit—whether you need high precision, fast inference, or something lightweight for basic tasks.</span></span><span class="EOP SCXW204701935 BCX0" data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-689718c elementor-widget elementor-widget-text-editor" data-id="689718c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><strong>High Accuracy &amp; Reasonable Speed:</strong><span data-contrast="auto"> Choose </span><strong>Phi-4 Mini (3.8B), Gemma 3 (4B)</strong><span data-contrast="auto">, or </span><strong>Llama 3.1 (8B)</strong><span data-contrast="auto"> for robust extraction and summarization accuracy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><strong>Fast Inference &amp; Moderate Accuracy:</strong><span data-contrast="auto"> Opt for </span><strong>Llama 3.2 (3B)</strong><span data-contrast="auto"> or </span><strong>StableLM Zephyr (3B)</strong><span data-contrast="auto"> for simpler tasks on limited hardware.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><strong>Balanced Performance (Accuracy-Speed Tradeoff): Mistral (7B)</strong><span data-contrast="auto"> provides strong general-purpose capability suitable for detailed document summarization tasks.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><strong>Low Resource Environments (Basic Tasks):</strong><span data-contrast="auto"> Consider </span><strong>TinyLlama (1.1B)</strong><span data-contrast="auto"> for quick extraction or classification on minimal hardware if accuracy isn&#8217;t critical.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-ee4c212 elementor-widget elementor-widget-heading" data-id="ee4c212" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Conclusion </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-510ec3a elementor-widget elementor-widget-text-editor" data-id="510ec3a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW44846787 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW44846787 BCX0">Lightweight LLMs are increasingly </span><span class="NormalTextRun SCXW44846787 BCX0">viable</span><span class="NormalTextRun SCXW44846787 BCX0"> solutions for local deployment, particularly in document-intensive industries such as insurance. Models such as Phi-4 Mini, Gemma 3 (4B), and Mistral 7B provide </span><span class="NormalTextRun SCXW44846787 BCX0">strong performance</span><span class="NormalTextRun SCXW44846787 BCX0"> in summarization, extraction, and classification tasks. Carefully balancing model size, inference speed, and accuracy ensures </span><span class="NormalTextRun SCXW44846787 BCX0">optimal</span><span class="NormalTextRun SCXW44846787 BCX0"> outcomes, empowering organizations with affordable, private, and responsive AI solutions directly on owned hardware.</span></span><span class="EOP SCXW44846787 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-8874a86 elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="8874a86" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<a class="elementor-cta" href="https://inero-software.com/optimization-of-back-office-processes-with-ai-agent-implementation-a-practical-example/">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/03/cta-1903-1030x579.png);" role="img" aria-label="cta 1903"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						This might interest you					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Optimization of Back-Office Processes with AI Agent Implementation: A Practical Example					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<span class="elementor-cta__button elementor-button elementor-size-">
						Read the full text					</span>
					</div>
							</div>
						</a>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>Artykuł <a href="https://inero-software.com/top-lightweight-llms-for-local-deployment/">Top Lightweight LLMs for Local Deployment</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7843</post-id>	</item>
		<item>
		<title>How to Prepare Your Company for AI Agent Implementation</title>
		<link>https://inero-software.com/how-to-prepare-your-company-for-ai-agent-implementation/</link>
		
		<dc:creator><![CDATA[Marta Kuprasz]]></dc:creator>
		<pubDate>Tue, 08 Apr 2025 08:45:46 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[Ai agent]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[BusinessProcessesOptimization]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7741</guid>

					<description><![CDATA[<p>This article explains what to focus on before deploying an AI agent, which areas of the business need to be well-prepared, and how to avoid common mistakes.</p>
<p>Artykuł <a href="https://inero-software.com/how-to-prepare-your-company-for-ai-agent-implementation/">How to Prepare Your Company for AI Agent Implementation</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7741" class="elementor elementor-7741" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-baa7356 e-flex e-con-boxed e-con e-parent" data-id="baa7356" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-e8f2a57 elementor-widget elementor-widget-html" data-id="e8f2a57" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-a583053 elementor-widget elementor-widget-text-editor" data-id="a583053" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4>Implementing an AI agent in a company is not only a technological challenge but also a strategic one. As more businesses consider using artificial intelligence in their daily operations—from customer service to document analysis—successful implementation requires careful planning. This article explains what to focus on before deploying an AI agent, which areas of the business need to be well-prepared, and how to avoid common mistakes.</h4>						</div>
				</div>
				<div class="elementor-element elementor-element-068de27 elementor-widget elementor-widget-text-editor" data-id="068de27" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>There are many areas where AI can be helpful. From automating routine tasks, supporting customer service and data analysis, to streamlining decision-making processes and creating intelligent assistants that support team workflows. The potential is enormous—but the key lies in properly preparing the organization for this change.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-37b67aa elementor-widget elementor-widget-heading" data-id="37b67aa" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Stages of AI Assistant Implementation</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-42d1b2e elementor-widget elementor-widget-text-editor" data-id="42d1b2e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The process of implementing an AI assistant in an organization can be divided into several stages, each requiring specific actions. From analyzing business needs, selecting the right language model, and preparing the infrastructure, to integrating with existing systems and testing—each step impacts the overall effectiveness of the solution.</p><p>The key stages are:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-10d35ec elementor-widget elementor-widget-text-editor" data-id="10d35ec" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ol><li data-leveltext="%1." data-font="" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1">Needs analysis and readiness assessment</li><li data-leveltext="%1." data-font="" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1">Data and content preparation</li><li data-leveltext="%1." data-font="" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1">Solution design</li><li data-leveltext="%1." data-font="" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1">Assistant development and configuration</li><li data-leveltext="%1." data-font="" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1">Testing and pilot phase</li><li data-leveltext="%1." data-font="" data-listid="1" data-list-defn-props="{&quot;335552541&quot;:0,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769242&quot;:[65533,0],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;%1.&quot;,&quot;469777815&quot;:&quot;multilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1">Deployment and maintenance</li></ol>						</div>
				</div>
				<div class="elementor-element elementor-element-bc457df elementor-widget elementor-widget-heading" data-id="bc457df" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Needs analysis and readiness assessment</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-52ccfa0 elementor-widget elementor-widget-text-editor" data-id="52ccfa0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>To ensure the best results from implementing an AI agent, start by asking yourself: which tasks and areas have the most potential for optimization through the use of artificial intelligence?</p>						</div>
				</div>
				<div class="elementor-element elementor-element-2092bd9 elementor-widget elementor-widget-text-editor" data-id="2092bd9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>When looking for an answer to this question, it’s worth carefully analyzing your company’s current structure, processes, and employee responsibilities. This will help identify so-called “bottlenecks” that may affect the quality of services provided. These might include, for example:</p><ul><li style="list-style-type: none;"><ul><li><p>long response times to quote requests</p></li><li><p>teams overloaded with routine tasks</p></li><li><p>lack of consistency in customer communication</p></li><li><p>manual processing of documents and data</p></li><li><p>difficulties in quickly accessing internal company knowledge</p></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-bb5fac4 elementor-widget elementor-widget-text-editor" data-id="bb5fac4" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Based on this analysis, you’ll be able to identify areas for improvement as well as the people who will directly benefit from the support of AI assistants.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-9a37800 elementor-widget elementor-widget-text-editor" data-id="9a37800" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The second area that should be reviewed is the existing infrastructure. Implementing an AI assistant doesn’t require a large amount of hardware. If the company doesn’t want to invest in new machines, it can opt to use cloud services such as Azure, AWS, or Google Cloud.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-aacf051 elementor-widget elementor-widget-text-editor" data-id="aacf051" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Data is a crucial part of the preparation process. To fully leverage the potential of dedicated AI solutions, it’s important to understand that training the model behind the assistant requires datasets stored in digital form. These should be well-organized and kept in a central repository or database. The less structured the data, the higher the cost of implementing the assistant—and the greater the risk that the solution won’t meet expectations.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-60b308a elementor-widget elementor-widget-heading" data-id="60b308a" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Data and content preparation</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-255b41d elementor-widget elementor-widget-text-editor" data-id="255b41d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>At this stage, it’s essential to gather all materials that contain important company knowledge—this may include PDF, Word, and Excel documents, website content, FAQ sections, emails, or data from databases.</p><p> </p><p>Next, the collected information needs to be properly prepared—organized, cleaned of unnecessary content (e.g., unreadable PDFs), standardized where possible, and exported to CSV or JSON files (e.g., emails).</p><p> </p><p>In some cases, such as when planning further model customization (fine-tuning), it will also be necessary to label the data or prepare a dedicated training set in the form of instructions and expected responses, for example:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-b8339ff elementor-widget elementor-widget-text-editor" data-id="b8339ff" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre>{"prompt": "What documents are required to sign an OCS agreement?", "response": "The following documents are required to sign an OCS agreement: ..."}</pre>						</div>
				</div>
				<div class="elementor-element elementor-element-301fbbb elementor-widget elementor-widget-heading" data-id="301fbbb" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Solution design</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-516c119 elementor-widget elementor-widget-text-editor" data-id="516c119" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>At this stage, decisions are made about the technical design of the solution. It’s important to define what type of assistant will best meet the company’s needs—whether it’s a simple chatbot answering questions, a more advanced assistant with access to company knowledge (so-called RAG – Retrieval-Augmented Generation), or an agent capable of independently performing specific tasks such as making bookings, generating reports, or sending emails.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-1b4b071 elementor-widget elementor-widget-text-editor" data-id="1b4b071" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The next step is selecting the appropriate technologies, including the large language model (LLM) that will power the assistant—such as GPT-4, Claude, Mistral, LLaMA, or Gemini—depending on specific needs and requirements related to privacy, cost, and integration capabilities.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d9610e4 elementor-widget elementor-widget-text-editor" data-id="d9610e4" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Finally, it’s worth preparing a list of functions the assistant should perform and planning integration with other systems used in the company—such as the CRM, knowledge base, or email.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-a34770c elementor-widget elementor-widget-heading" data-id="a34770c" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Assistant development and configuration</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-ff744d0 elementor-widget elementor-widget-text-editor" data-id="ff744d0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>At this stage, both the technical backend and the user-facing part of the assistant (frontend) are developed. This could be, for example, a chat interface on the website, a button that launches the assistant in an application, or a widget integrated with tools like Slack. You can read more about how AI agent integration with the <a href="https://inero-software.com/optimization-of-back-office-processes-with-ai-agent-implementation-a-practical-example/">Slack communication platform can look here &gt;&gt;LINK</a></p>						</div>
				</div>
				<div class="elementor-element elementor-element-3b817d5 elementor-widget elementor-widget-text-editor" data-id="3b817d5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>In parallel, the selected language model is deployed—via services such as Azure OpenAI, OpenAI API, Anthropic (Claude), Google Vertex AI (Gemini), or locally using open-source models like LLaMA, Mistral, or Mixtral.</p><p> </p><p>If the assistant is meant to use internal company knowledge, a RAG (Retrieval-Augmented Generation) mechanism needs to be configured—enabling it to search and match relevant documents to user queries.</p><p> </p><p>Finally, integrations with other systems—such as CRM, ticketing systems, or email—are implemented, allowing the assistant to meaningfully support the team’s day-to-day work.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-fa67de7 elementor-widget elementor-widget-heading" data-id="fa67de7" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Testing and pilot phase</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-56858f2 elementor-widget elementor-widget-text-editor" data-id="56858f2" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>After implementation, thorough testing of the solution is essential. The first step is functional testing—checking whether the assistant correctly understands user intent, responds in line with company documentation, and handles different types of queries appropriately.</p><p> </p><p>The next phase is testing with end users (UAT – User Acceptance Testing), which helps assess how well the assistant performs in real-world scenarios and whether it meets employees’ expectations.</p><p> </p><p>Based on feedback and observations, iterative improvements are made—such as adjusting responses, adding new documents to the knowledge base, or refining prompts and the agent’s logic. This phase is often repeated several times until a satisfactory level of quality is achieved.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-dd6864e elementor-widget elementor-widget-heading" data-id="dd6864e" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Deployment and maintenance</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-c10b8e9 elementor-widget elementor-widget-text-editor" data-id="c10b8e9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>After completing the testing phase, the assistant is deployed to the target infrastructure—this may be a public cloud (e.g., Azure, AWS, GCP), on-premise servers, or a hybrid solution, depending on security and availability requirements. More about this is covered later in the article.</p><p> </p><p>It’s also necessary to set up monitoring, which allows you to track things like token usage, query frequency, error rates, and the quality of generated responses. This enables quick issue resolution and cost optimization.</p><p>In daily use, it’s important to keep the data up to date—adding new documents, removing outdated information, and updating the knowledge base the assistant relies on.</p><p> </p><p>Over time, as business needs evolve, it may be worth considering retraining or fine-tuning the model—e.g., every few months—to better align it with the organization’s specific context.</p><p> </p><p>Finally, it’s important to provide technical support and user assistance to ensure the solution is not only technically reliable but also convenient and intuitive for everyday use.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-817acb3 elementor-widget elementor-widget-heading" data-id="817acb3" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Data privacy</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-20711ec elementor-widget elementor-widget-text-editor" data-id="20711ec" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>In the “Deployment and maintenance” section, we discussed the available options for choosing the infrastructure on which the AI agent will be deployed.</p><p>Each solution has its pros and cons. Choosing an on-premise setup gives you full control over the data, but it requires a dedicated machine with specific parameters.</p><p>Another option is using a public cloud service, such as Azure. Microsoft clearly states that data submitted to the Azure OpenAI service is not used to train or improve OpenAI or Microsoft models (<a href="https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy?tabs=azure-portal">source</a>).</p><p>According to Microsoft, prompts and responses are not shared with other customers or OpenAI. Azure operates in full isolation mode: when using GPT-4 on Azure, no information from your conversations is shared with OpenAI LLC. Microsoft has confirmed this in a Data Processing Addendum (DPA).</p>						</div>
				</div>
				<div class="elementor-element elementor-element-16536f6 elementor-widget elementor-widget-heading" data-id="16536f6" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">AI decision accountability</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-68f238b elementor-widget elementor-widget-text-editor" data-id="68f238b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>It’s important to remember that formal and legal responsibility for the outcomes of an AI agent’s actions and the data it processes lies with the entity that implemented and oversees the solution—most often.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-104a126 elementor-widget elementor-widget-text-editor" data-id="104a126" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ol><li>the organization (e.g., the company that deployed the assistant),</li><li>the system administrator,</li><li>the individual making decisions based on AI suggestions (e.g., a customer service representative, recruiter, or doctor).</li></ol>						</div>
				</div>
				<div class="elementor-element elementor-element-85f731f elementor-widget elementor-widget-text-editor" data-id="85f731f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>How to reduce risk?</strong></p>						</div>
				</div>
				<div class="elementor-element elementor-element-6db7698 elementor-widget elementor-widget-text-editor" data-id="6db7698" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ol><li>Human-in-the-loop (HITL) – A human must approve important decisions, while AI only supports the process (e.g., the assistant drafts a response, but a person approves it).</li><li>Clear disclaimers and warnings – The AI should inform users: “I am an AI assistant – please verify my responses before making a decision.”</li><li>Source verification – The AI assistant should, where possible, cite sources for its answers or indicate when it doesn’t know rather than guessing. Using RAG enables precise control over the knowledge base.</li></ol>						</div>
				</div>
				<div class="elementor-element elementor-element-ab8fe88 elementor-widget elementor-widget-heading" data-id="ab8fe88" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Summary</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-32ea263 elementor-widget elementor-widget-text-editor" data-id="32ea263" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>The process of implementing an AI agent must be well-planned and carefully considered. It may seem challenging at first, but with proper preparation, it can deliver long-term benefits. If you need support, feel free to contact us.</strong></p><p><span data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-05552e5 elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="05552e5" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<a class="elementor-cta" href="https://inero-software.com/contact-inero-software-rd-software-house/">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/02/cta-AI2-1030x579.png);" role="img" aria-label="cta AI2"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						AI Agent in Your Company?					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Write to us and find out how an AI Agent can support your company.					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<span class="elementor-cta__button elementor-button elementor-size-">
						Contact					</span>
					</div>
							</div>
						</a>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>Artykuł <a href="https://inero-software.com/how-to-prepare-your-company-for-ai-agent-implementation/">How to Prepare Your Company for AI Agent Implementation</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7741</post-id>	</item>
		<item>
		<title>Deploying LLMs Locally: A Guide to Ollama and LM Studio</title>
		<link>https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Fri, 04 Apr 2025 08:53:42 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[api]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[CLI Tool]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[LM Studio]]></category>
		<category><![CDATA[Local deployment]]></category>
		<category><![CDATA[Ollama]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7692</guid>

					<description><![CDATA[<p>Whether you’re building a custom chatbot, agent, an AI-powered code assistant, or using AI to analyse documents offline, local deployment empowers you to experiment and innovate without relying on external services. </p>
<p>Artykuł <a href="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/">Deploying LLMs Locally: A Guide to Ollama and LM Studio</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7692" class="elementor elementor-7692" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-139e1f8 e-flex e-con-boxed e-con e-parent" data-id="139e1f8" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-2474df1 elementor-widget elementor-widget-html" data-id="2474df1" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-29e8b23 elementor-widget elementor-widget-text-editor" data-id="29e8b23" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4><span class="TextRun SCXW12802383 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW12802383 BCX0">Local deployment of Large Language Models (LLMs) is becoming increasingly popular among developers, tech enthusiasts, and professionals in industries like insurance and transport. Unlike cloud-based APIs, local LLM deployment offers greater privacy, offline accessibility, and complete control over resource optimization and inference performance.</span></span><span class="EOP SCXW12802383 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h4>						</div>
				</div>
				<div class="elementor-element elementor-element-d13f5fb elementor-widget elementor-widget-text-editor" data-id="d13f5fb" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW230118114 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW230118114 BCX0">Running models like Llama 2 or Mistral directly on your hardware means your data stays on your machine — ideal for privacy-sensitive tasks such as processing insurance documents or working with proprietary transport data. There are no recurring API costs, and the performance depends solely on your system. Whether </span><span class="NormalTextRun SCXW230118114 BCX0">you&#8217;re</span><span class="NormalTextRun SCXW230118114 BCX0"> building a custom chatbot, </span><span class="NormalTextRun SCXW230118114 BCX0">agent, </span><span class="NormalTextRun SCXW230118114 BCX0">an AI-powered code assistant, or using AI to </span><span class="NormalTextRun SCXW230118114 BCX0">analyse</span><span class="NormalTextRun SCXW230118114 BCX0"> documents offline, local deployment empowers you to experiment and innovate without relying on external services.</span></span><span class="EOP SCXW230118114 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-313a919 elementor-widget elementor-widget-text-editor" data-id="313a919" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW97631897 BCX0">In this guide, </span><span class="NormalTextRun SCXW97631897 BCX0">we&#8217;ll</span><span class="NormalTextRun SCXW97631897 BCX0"> explore two powerful tools that make this possible: </span></span><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW97631897 BCX0"><b>Ollama</b></span></span><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW97631897 BCX0"> and </span></span><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW97631897 BCX0"><b>LM Studio</b></span></span><span class="TextRun SCXW97631897 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW97631897 BCX0">. </span><span class="NormalTextRun SCXW97631897 BCX0">We&#8217;ll</span><span class="NormalTextRun SCXW97631897 BCX0"> walk through installation, usage, and customization, helping you pick the best </span><span class="NormalTextRun SCXW97631897 BCX0">option</span><span class="NormalTextRun SCXW97631897 BCX0"> for your goals.</span></span><span class="EOP SCXW97631897 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:240,&quot;335559739&quot;:240}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-66f4910 elementor-widget elementor-widget-heading" data-id="66f4910" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Getting Started with Ollama (CLI Tool) </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-55390e0 elementor-widget elementor-widget-text-editor" data-id="55390e0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW101755402 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW101755402 BCX0">Ollama</span><span class="NormalTextRun SCXW101755402 BCX0"> is a lightweight, open-source command-line tool for running LLMs locally. It acts as a model manager and runtime, making it easy to download and execute open-source models (like Llama 2, Mistral, </span><span class="NormalTextRun SpellingErrorV2Themed SCXW101755402 BCX0">CodeLlama</span><span class="NormalTextRun SCXW101755402 BCX0">, etc.) on your </span><span class="NormalTextRun SCXW101755402 BCX0">machine.</span> <span class="NormalTextRun SpellingErrorV2Themed SCXW101755402 BCX0">Ollama</span><span class="NormalTextRun SCXW101755402 BCX0"> is available for macOS, Linux, and Windows</span><span class="NormalTextRun SCXW101755402 BCX0">, and it includes a local REST API for integration into applications.</span></span><span class="EOP SCXW101755402 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f83e139 elementor-widget elementor-widget-text-editor" data-id="f83e139" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW107598507 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW107598507 BCX0">1.<b> Install </b></span><b><span class="NormalTextRun SpellingErrorV2Themed SCXW107598507 BCX0">Ollama</span><span class="NormalTextRun SCXW107598507 BCX0"> on Your System:</span></b></span><span class="TextRun SCXW107598507 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW107598507 BCX0"><b> </b>Download the installer for your platform from the official </span><span class="NormalTextRun SpellingErrorV2Themed SCXW107598507 BCX0">Ollama</span><span class="NormalTextRun SCXW107598507 BCX0"> website or use a package manager.</span></span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f34ab2b elementor-widget elementor-widget-text-editor" data-id="f34ab2b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW8978879 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8978879 BCX0">On Windows, download the </span></span><span class="TextRun SCXW8978879 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8978879 BCX0"><b>OllamaSetup.exe</b></span></span><span class="TextRun SCXW8978879 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8978879 BCX0"> from the website and run </span><span class="NormalTextRun SCXW8978879 BCX0">it.</span><span class="NormalTextRun SCXW8978879 BCX0"> On Linux, you can install </span><span class="NormalTextRun SpellingErrorV2Themed SCXW8978879 BCX0">Ollama</span><span class="NormalTextRun SCXW8978879 BCX0"> with one command:</span></span><span class="EOP SCXW8978879 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-b668357 elementor-widget elementor-widget-text-editor" data-id="b668357" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW8325834 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8325834 BCX0">curl -</span><span class="NormalTextRun SpellingErrorV2Themed SCXW8325834 BCX0">fsSL</span> </span><a class="Hyperlink SCXW8325834 BCX0" href="https://ollama.com/install.sh" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW8325834 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8325834 BCX0" data-ccp-charstyle="Hyperlink">https://ollama.com/install.sh</span></span></a><span class="TextRun SCXW8325834 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW8325834 BCX0"> | </span><span class="NormalTextRun SpellingErrorV2Themed SCXW8325834 BCX0">sh</span></span><span class="EOP SCXW8325834 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-c086d50 elementor-widget elementor-widget-text-editor" data-id="c086d50" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW172550952 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW172550952 BCX0">After installation, open a terminal/command prompt and verify </span><span class="NormalTextRun SCXW172550952 BCX0">it’s</span><span class="NormalTextRun SCXW172550952 BCX0"> installed by checking the version:</span></span><span class="EOP SCXW172550952 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-a98aedb elementor-widget elementor-widget-text-editor" data-id="a98aedb" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW230245657 BCX0" lang="EN-GB" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text; white-space-collapse: preserve; font-size: 11pt; line-height: 19.7625px; font-family: Consolas, Consolas_EmbeddedFont, Consolas_MSFontService, monospace; font-variant-ligatures: none !important;" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text; background-position: 0px 100%; background-repeat: repeat-x; border-bottom: 1px solid transparent;">ollama</span><span class="NormalTextRun SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text;"> -</span><span class="NormalTextRun SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text;">-</span><span class="NormalTextRun SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text;">version</span></span><span class="EOP SCXW230245657 BCX0" style="-webkit-user-drag: none; -webkit-tap-highlight-color: transparent; margin: 0px; padding: 0px; user-select: text; white-space-collapse: preserve; font-size: 11pt; line-height: 19.7625px; font-family: Consolas, Consolas_EmbeddedFont, Consolas_MSFontService, monospace;" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-fe7e512 elementor-widget elementor-widget-text-editor" data-id="fe7e512" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW228829587 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW228829587 BCX0">This should display the installed </span><span class="NormalTextRun SpellingErrorV2Themed SCXW228829587 BCX0">Ollama</span><span class="NormalTextRun SCXW228829587 BCX0"> version, confirming </span><span class="NormalTextRun SCXW228829587 BCX0">it’s</span><span class="NormalTextRun SCXW228829587 BCX0"> ready to </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW228829587 BCX0">use,</span><span class="NormalTextRun SCXW228829587 BCX0"> e.g.:</span></span><span class="EOP SCXW228829587 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-8f0123b elementor-widget elementor-widget-text-editor" data-id="8f0123b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW19868586 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW19868586 BCX0">ollama</span><span class="NormalTextRun SCXW19868586 BCX0"> version is 0.6.2</span></span><span class="EOP SCXW19868586 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-cf0e477 elementor-widget elementor-widget-text-editor" data-id="cf0e477" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW20221182 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW20221182 BCX0">2<b>. Download an LLM Model (&#8220;Pull&#8221; a Model)</b>:</span></span><span class="TextRun SCXW20221182 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"> <span class="NormalTextRun SpellingErrorV2Themed SCXW20221182 BCX0">Ollama</span><span class="NormalTextRun SCXW20221182 BCX0"> has a built-in model library. You can search their </span><span class="NormalTextRun SpellingErrorV2Themed SCXW20221182 BCX0">catalog</span><span class="NormalTextRun SCXW20221182 BCX0"> on the website or simply pull a known model by name. For example, to download the 7B parameter Llama 2 chat model, run:</span></span><span class="EOP SCXW20221182 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-abab4bd elementor-widget elementor-widget-text-editor" data-id="abab4bd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW86029186 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW86029186 BCX0">ollama</span><span class="NormalTextRun SCXW86029186 BCX0"> pull llama2:7b-chat</span></span><span class="EOP SCXW86029186 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-e31da10 elementor-widget elementor-widget-text-editor" data-id="e31da10" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW158953993 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW158953993 BCX0">This command fetches the model weights to your machine (it may take a while, as models are multiple GBs in </span><span class="NormalTextRun SCXW158953993 BCX0">size)</span><span class="NormalTextRun SCXW158953993 BCX0">. You only need to pull a model once; </span><span class="NormalTextRun SCXW158953993 BCX0">afterward</span> <span class="NormalTextRun SCXW158953993 BCX0">it’s</span><span class="NormalTextRun SCXW158953993 BCX0"> stored locally. You can list all downloaded models with </span><span class="NormalTextRun SpellingErrorV2Themed SCXW158953993 BCX0">ollama</span><span class="NormalTextRun SCXW158953993 BCX0"> list if needed.</span></span><span class="EOP SCXW158953993 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-0fe8c48 elementor-widget elementor-widget-text-editor" data-id="0fe8c48" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW87322540 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW87322540 BCX0">3. Run the Model Locally:</span></span></strong><span class="TextRun SCXW87322540 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW87322540 BCX0"> Once downloaded, you can execute the model with the </span><span class="NormalTextRun SpellingErrorV2Themed SCXW87322540 BCX0">ollama</span><span class="NormalTextRun SCXW87322540 BCX0"> run command. This will launch an interactive session where you can enter prompts and get responses. For example:</span></span><span class="EOP SCXW87322540 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-bb58a0a elementor-widget elementor-widget-text-editor" data-id="bb58a0a" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW171041342 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW171041342 BCX0">ollama</span><span class="NormalTextRun SCXW171041342 BCX0"> run llama2:7b-chat &gt;&gt;&gt; What is the capital city of Poland?</span></span><span class="EOP SCXW171041342 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-c17cc6e elementor-widget elementor-widget-text-editor" data-id="c17cc6e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0">After running the above, </span><span class="NormalTextRun SpellingErrorV2Themed SCXW99918251 BCX0">Ollama</span><span class="NormalTextRun SCXW99918251 BCX0"> will load the </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW99918251 BCX0">model</span><span class="NormalTextRun SCXW99918251 BCX0"> and </span><span class="NormalTextRun SCXW99918251 BCX0">you’ll</span><span class="NormalTextRun SCXW99918251 BCX0"> see an </span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0">&gt;&gt;&gt;</span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0"> prompt. You can then type your questions or instructions. The model (here Llama 2 7B chat) will generate a response to each prompt. For instance, you might ask “What is the capital of France?” and get an answer like “Paris is the capital of France.” printed in the terminal. Internally, the first run may take a bit to initialize, but </span><span class="NormalTextRun SCXW99918251 BCX0">subsequent</span><span class="NormalTextRun SCXW99918251 BCX0"> prompts are answered interactively. </span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0">Tip:</span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0"> You can also pass a one-off prompt directly in the command, e.g. </span><span class="NormalTextRun SpellingErrorV2Themed SCXW99918251 BCX0">ollama</span><span class="NormalTextRun SCXW99918251 BCX0"> run llama2:7b </span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0">&#8220;<b>What is the capital city of Poland?</b>&#8220;</span></span><span class="TextRun SCXW99918251 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW99918251 BCX0"> will output a single response and return to the </span><span class="NormalTextRun SCXW99918251 BCX0">shell.</span></span><span class="EOP SCXW99918251 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-846e51f elementor-widget elementor-widget-text-editor" data-id="846e51f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW252478220 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW252478220 BCX0">You can also start </span><span class="NormalTextRun SpellingErrorV2Themed SCXW252478220 BCX0">Ollama</span><span class="NormalTextRun SCXW252478220 BCX0"> as a background server with </span><span class="NormalTextRun SpellingErrorV2Themed SCXW252478220 BCX0">ollama</span><span class="NormalTextRun SCXW252478220 BCX0"> serve. This enables the REST API on localhost:11434, which developers can use to integrate the model into apps via HTTP </span><span class="NormalTextRun SCXW252478220 BCX0">calls.</span><span class="NormalTextRun SCXW252478220 BCX0"> You can ask the model by sending POST request, e.g.:</span></span><span class="EOP SCXW252478220 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-7fa1a27 elementor-widget elementor-widget-text-editor" data-id="7fa1a27" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW24036424 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW24036424 BCX0">curl </span></span><a class="Hyperlink SCXW24036424 BCX0" href="http://localhost:11434/api/generate" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW24036424 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW24036424 BCX0" data-ccp-charstyle="Hyperlink">http://localhost:11434/api/generate</span></span></a><span class="TextRun SCXW24036424 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW24036424 BCX0"> -d </span></span><span class="TextRun SCXW24036424 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW24036424 BCX0">'{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW24036424 BCX0"><span class="SCXW24036424 BCX0"> </span><br class="SCXW24036424 BCX0" /></span><span class="TextRun SCXW24036424 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW24036424 BCX0">  "model": "llama2:7b-chat",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW24036424 BCX0"><span class="SCXW24036424 BCX0"> </span><br class="SCXW24036424 BCX0" /></span><span class="TextRun SCXW24036424 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW24036424 BCX0">  "prompt": "What is the capital city of Poland?"</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW24036424 BCX0"><span class="SCXW24036424 BCX0"> </span><br class="SCXW24036424 BCX0" /></span><span class="TextRun SCXW24036424 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW24036424 BCX0">}'</span></span><span class="EOP SCXW24036424 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-77dd7b1 elementor-widget elementor-widget-text-editor" data-id="77dd7b1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW28772340 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW28772340 BCX0">The API returns newline-separated JSON objects, chunk by chunk, as the model generates the response:</span></span><span class="EOP SCXW28772340 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-943d033 elementor-widget elementor-widget-text-editor" data-id="943d033" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"model"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"llama2:7b-chat"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">created_at</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"2025-04-02T15:19:17.1569954Z"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"response"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"The"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"done"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">false</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"model"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"llama2:7b-chat"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">created_at</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"2025-04-02T15:19:17.268992Z"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"response"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">" capital"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"done"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">false</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"model"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"llama2:7b-chat"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">created_at</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"2025-04-02T15:19:17.3796491Z"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"response"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">" city"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"done"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">false</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">...</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"model"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"llama2:7b-chat"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">created_at</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"2025-04-02T15:19:21.3106413Z"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"response"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">" Warszawa"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"done"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">false</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"model"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"llama2:7b-chat"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">created_at</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"2025-04-02T15:19:21.4619772Z"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"response"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">")."</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"done"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">false</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"model"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"llama2:7b-chat"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">created_at</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"2025-04-02T15:19:21.6296267Z"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"response"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">""</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"done"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">true</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">done_reason</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"stop"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">total_duration</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: 5337417000,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">load_duration</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: 8625100,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">prompt_eval_count</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: 28,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">prompt_eval_duration</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: 854952300,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">eval_count</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: 15,</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">    </span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">"</span><span class="NormalTextRun SpellingErrorV2Themed SCXW52386783 BCX0">eval_duration</span><span class="NormalTextRun SCXW52386783 BCX0">"</span></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">: 4472807400</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW52386783 BCX0"><span class="SCXW52386783 BCX0"> </span><br class="SCXW52386783 BCX0" /></span><span class="TextRun SCXW52386783 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52386783 BCX0">}</span></span><span class="EOP SCXW52386783 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-10d3ffe elementor-widget elementor-widget-text-editor" data-id="10d3ffe" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW26657317 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW26657317 BCX0">If you set stream: </span></span><span class="TextRun SCXW26657317 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW26657317 BCX0">false</span></span><span class="TextRun SCXW26657317 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW26657317 BCX0">, the response is a single JSON object:</span></span><span class="EOP SCXW26657317 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-af79c25 elementor-widget elementor-widget-text-editor" data-id="af79c25" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">curl </span></span><a class="Hyperlink SCXW81302069 BCX0" href="http://localhost:11434/api/generate" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0" data-ccp-charstyle="Hyperlink">http://localhost:11434/api/generate</span></span></a><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0"> -d </span></span><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">'{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW81302069 BCX0"><span class="SCXW81302069 BCX0"> </span><br class="SCXW81302069 BCX0" /></span><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">  "model": "llama2:7b-chat",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW81302069 BCX0"><span class="SCXW81302069 BCX0"> </span><br class="SCXW81302069 BCX0" /></span><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">  "prompt": "What is the capital city of Poland?",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW81302069 BCX0"><span class="SCXW81302069 BCX0"> </span><br class="SCXW81302069 BCX0" /></span><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">  "stream": false</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW81302069 BCX0"><span class="SCXW81302069 BCX0"> </span><br class="SCXW81302069 BCX0" /></span><span class="TextRun SCXW81302069 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW81302069 BCX0">}</span></span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-3d76430 elementor-widget elementor-widget-text-editor" data-id="3d76430" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0">You can also set </span><span class="NormalTextRun SCXW62602235 BCX0">a number of</span><span class="NormalTextRun SCXW62602235 BCX0"> model parameters such as </span></span><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0">temperature</span></span><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0"> by adding field </span></span><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0">options</span></span><span class="TextRun SCXW62602235 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW62602235 BCX0">:</span></span><span class="EOP SCXW62602235 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-12fc8b0 elementor-widget elementor-widget-text-editor" data-id="12fc8b0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">curl </span></span><a class="Hyperlink SCXW121643900 BCX0" href="http://localhost:11434/api/generate" target="_blank" rel="noreferrer noopener"><span class="TextRun Underlined SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0" data-ccp-charstyle="Hyperlink">http://localhost:11434/api/generate</span></span></a><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0"> -d </span></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">'{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  "model": "llama2:7b-chat",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  "prompt": "What is the capital city of Poland?",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  "options": {</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span> <span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">"temperature": 0.2  </span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  }</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">  "stream": false</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW121643900 BCX0"><span class="SCXW121643900 BCX0"> </span><br class="SCXW121643900 BCX0" /></span><span class="TextRun SCXW121643900 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW121643900 BCX0">}'</span></span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-c478ddd elementor-widget elementor-widget-text-editor" data-id="c478ddd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW13767485 BCX0">4. <b>Customize Models:</b></span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><b> </b><span class="NormalTextRun SpellingErrorV2Themed SCXW13767485 BCX0">Ollama</span><span class="NormalTextRun SCXW13767485 BCX0"> supports a </span><span class="NormalTextRun SpellingErrorV2Themed SCXW13767485 BCX0">Dockerfile</span><span class="NormalTextRun SCXW13767485 BCX0">-like syntax called a </span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW13767485 BCX0"><b>Modelfile</b></span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW13767485 BCX0"><b> </b>to create </span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW13767485 BCX0"><b>custom LLM variants</b></span></span><span class="TextRun SCXW13767485 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW13767485 BCX0">. These let you:</span></span><span class="EOP SCXW13767485 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-dde64e1 elementor-widget elementor-widget-text-editor" data-id="dde64e1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="none">Start from an existing model (like </span><span data-contrast="none">llama3</span><span data-contrast="none">)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="none">Add custom system prompts</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="none">Inject user-defined data (e.g., instructions, context)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><span data-contrast="none">Set model parameters, like temperature</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-9672691 elementor-widget elementor-widget-text-editor" data-id="9672691" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="none">Here is the simple example how you can create your custom assistant for processing insurance documents:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-b3b173f elementor-widget elementor-widget-text-editor" data-id="b3b173f" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">FROM llama2:7b-chat</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">PARAMETER temperature 0.7</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">SYSTEM </span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"""</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">You are an assistant that extracts insurance-related information from a given input text.</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">You must extract and return only the following fields:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- policy_number</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- insurance_period</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- insured (company or person name)</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- nip (tax identification number)</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">- address (of the insured)</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Return the output as a **clean JSON object** -- not as a string, not inside quotes, and without any commentary. If a field is missing, use "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Not found</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">".</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Example output format:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">{</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW168916518 BCX0">policy_number</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW168916518 BCX0">insurance_period</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">insured</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">nip</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">",</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">  "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">address</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">": "</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">...</span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"""</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">TEMPLATE </span></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"""</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">{{ .</span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW168916518 BCX0">System }</span><span class="NormalTextRun SCXW168916518 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Input:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">{{ .</span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW168916518 BCX0">Prompt }</span><span class="NormalTextRun SCXW168916518 BCX0">}</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">Response:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW168916518 BCX0"><span class="SCXW168916518 BCX0"> </span><br class="SCXW168916518 BCX0" /></span><span class="TextRun SCXW168916518 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW168916518 BCX0">"""</span></span><span class="EOP SCXW168916518 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut elit tellus, luctus nec ullamcorper mattis, pulvinar dapibus leo.</pre>						</div>
				</div>
				<div class="elementor-element elementor-element-70f4001 elementor-widget elementor-widget-text-editor" data-id="70f4001" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="NormalTextRun SCXW237655292 BCX0">To use </span><span class="NormalTextRun SpellingErrorV2Themed SCXW237655292 BCX0">Makefile</span><span class="NormalTextRun SCXW237655292 BCX0">, save it in a directory, e.g. insurance-</span><span class="NormalTextRun SCXW237655292 BCX0">a</span><span class="NormalTextRun SCXW237655292 BCX0">ssistant</span><span class="NormalTextRun SCXW237655292 BCX0"> and create the custom model:</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5d4fca1 elementor-widget elementor-widget-text-editor" data-id="5d4fca1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW150813743 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW150813743 BCX0">ollama</span><span class="NormalTextRun SCXW150813743 BCX0"> create insurance-assistant -f insurance-</span><span class="NormalTextRun SpellingErrorV2Themed SCXW150813743 BCX0">assitant</span><span class="NormalTextRun SCXW150813743 BCX0">/</span><span class="NormalTextRun SpellingErrorV2Themed SCXW150813743 BCX0">Modelfile</span></span><span class="EOP SCXW150813743 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-3cfba04 elementor-widget elementor-widget-text-editor" data-id="3cfba04" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="none">Then, you can use your model by providing the proper model name in a request:</span> </p>						</div>
				</div>
				<div class="elementor-element elementor-element-607dc80 elementor-widget elementor-widget-text-editor" data-id="607dc80" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span><span data-contrast="none">curl </span><a href="http://localhost:11434/api/generate"><span data-contrast="none">http://localhost:11434/api/generate</span></a><span data-contrast="none"> -d </span><span data-contrast="none">'{</span> <br /><span data-contrast="none">  "model": "insurance-extractor",</span> <br /><span data-contrast="none">  "prompt": "",</span> <br /><span data-contrast="none">  "stream": false</span> <br /><span data-contrast="none">}'</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-bc5221c elementor-widget elementor-widget-text-editor" data-id="bc5221c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW210001513 BCX0">Ollama</span><span class="NormalTextRun SCXW210001513 BCX0"> is purely CLI-based, so </span><span class="NormalTextRun SCXW210001513 BCX0">there’s</span><span class="NormalTextRun SCXW210001513 BCX0"> no graphical interface. However, this makes it </span></span><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW210001513 BCX0">powerful for automation</span></span><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW210001513 BCX0"> – you can pipe input/output, log responses to files, or call the </span><span class="NormalTextRun SpellingErrorV2Themed SCXW210001513 BCX0">Ollama</span><span class="NormalTextRun SCXW210001513 BCX0"> API from code. In summary, with just a few commands, you have a privacy-protecting LLM running on your PC, ready to answer questions or </span><span class="NormalTextRun SCXW210001513 BCX0">assist</span><span class="NormalTextRun SCXW210001513 BCX0"> in coding, all </span></span><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW210001513 BCX0">without any internet connection needed</span></span><span class="TextRun SCXW210001513 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW210001513 BCX0">.</span></span><span class="EOP SCXW210001513 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-ce97082 elementor-widget elementor-widget-heading" data-id="ce97082" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Getting Started with LM Studio (Desktop App) </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-f911779 elementor-widget elementor-widget-image" data-id="f911779" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7711" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm1/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1.png" data-orig-size="1920,1080" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM1" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1-1030x579.png" tabindex="0" role="button" width="1030" height="579" src="https://inero-software.com/wp-content/uploads/2025/04/LLM1-1030x579.png" class="attachment-large size-large wp-image-7711" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/LLM1-1030x579.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/LLM1-300x169.png 300w, https://inero-software.com/wp-content/uploads/2025/04/LLM1-768x432.png 768w, https://inero-software.com/wp-content/uploads/2025/04/LLM1-1536x864.png 1536w, https://inero-software.com/wp-content/uploads/2025/04/LLM1-533x300.png 533w, https://inero-software.com/wp-content/uploads/2025/04/LLM1.png 1920w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7711" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm1/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1.png" data-orig-size="1920,1080" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM1" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM1-1030x579.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-14c3b4c elementor-widget elementor-widget-text-editor" data-id="14c3b4c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><b><span data-contrast="none">LM Studio</span></b><span data-contrast="none"> is a user-friendly desktop application that lets you </span><b><span data-contrast="none">download and run local LLMs via a graphical interface</span></b><span data-contrast="none">. It’s cross-platform (Windows, macOS, Linux) and ideal for beginners who prefer not to use the command line. With LM Studio, you can chat with models in a nice UI, manage model downloads, and even run a local server to use the model in other apps.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p><p><span data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-623f7d7 elementor-widget elementor-widget-text-editor" data-id="623f7d7" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW122283343 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW122283343 BCX0"><b>1. Install and Launch LM Studio:</b></span></span><span class="TextRun SCXW122283343 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW122283343 BCX0"> Download the installer for your OS from the LM Studio website and install it. After installation, launch the </span></span><span class="TextRun SCXW122283343 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW122283343 BCX0"><b>LM Studio</b></span></span><span class="TextRun SCXW122283343 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW122283343 BCX0"><b> app</b>. The first time you open </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW122283343 BCX0">it,</span> <span class="NormalTextRun SCXW122283343 BCX0">you’ll</span><span class="NormalTextRun SCXW122283343 BCX0"> be prompted to download an AI model. You can choose from a list of popular open-source models. For example, you might select a smaller model like “Mistral 7B” or an instruction-tuned Llama2 variant to start.</span></span><span class="EOP SCXW122283343 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-eef6a33 elementor-widget elementor-widget-text-editor" data-id="eef6a33" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0">2. Run Your First Chat:</span></span></strong><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0"> Once the model is downloaded, LM Studio will load it into memory. You can then start a new chat session in the app. The interface typically has a text box where you can enter your prompt or question, and the model’s response will appear in the chat window. Simply type a query (for example: </span></span><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0">“What’s the capital of France?”</span></span><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0"> or </span></span><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0">“Explain quantum physics simply.”</span></span><span class="TextRun SCXW160100961 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW160100961 BCX0">) and hit Enter. The AI’s answer will be displayed as the “Assistant” reply in the chat. LM Studio conveniently shows the generation metrics:</span></span><span class="EOP SCXW160100961 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-e83ff6c elementor-widget elementor-widget-text-editor" data-id="e83ff6c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="10" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="none">number of input and output tokens,</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="10" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="none">tokens per second &#8211; you can see how fast the model is generating text,</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="10" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="none">context occupancy,</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul><ul><li style="list-style-type: none;"><ul><li data-leveltext="" data-font="Symbol" data-listid="10" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><span data-contrast="none">system resources usage (RAM and processor usage).</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-4c50be9 elementor-widget elementor-widget-text-editor" data-id="4c50be9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong><span class="TextRun SCXW47407587 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW47407587 BCX0">3. Explore the Features:</span></span></strong><span class="TextRun SCXW47407587 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW47407587 BCX0"> The LM Studio GUI provides </span><span class="NormalTextRun SCXW47407587 BCX0">additional</span><span class="NormalTextRun SCXW47407587 BCX0"> features accessible to both beginners and advanced users:</span></span><span class="EOP SCXW47407587 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-0bc4155 elementor-widget elementor-widget-text-editor" data-id="0bc4155" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><strong><span class="TextRun SCXW37213095 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW37213095 BCX0">Model Library:</span></span></strong><span class="TextRun SCXW37213095 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW37213095 BCX0"> A “Discover Models” or </span><span class="NormalTextRun SpellingErrorV2Themed SCXW37213095 BCX0">catalog</span><span class="NormalTextRun SCXW37213095 BCX0"> section where you can download new models or update existing ones. </span><span class="NormalTextRun SCXW37213095 BCX0">You’re</span><span class="NormalTextRun SCXW37213095 BCX0"> not limited to one model – you can have multiple models stored and switch between them. This means you have a wide selection: from small 3B parameter models for speed, up to 70B models if your system can handle them.</span></span><span class="EOP SCXW37213095 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-c3af6d0 elementor-widget elementor-widget-text-editor" data-id="c3af6d0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><strong><span class="TextRun SCXW224090495 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW224090495 BCX0">Chat Interface:</span></span></strong><span class="TextRun SCXW224090495 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW224090495 BCX0"> The main chat screen (as shown above) is where you interact with the model. Each new prompt you enter is answered by the model in a conversational format. You can have multi-turn dialogues, just like chatting with ChatGPT. </span><span class="NormalTextRun SCXW224090495 BCX0">There’s</span><span class="NormalTextRun SCXW224090495 BCX0"> no need to manage a prompt history manually – the app keeps the conversation context.</span></span><span class="EOP SCXW224090495 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-fcf03f1 elementor-widget elementor-widget-text-editor" data-id="fcf03f1" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><strong><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">Advanced Settings:</span></span></strong><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> On the side panel, LM Studio offers configuration knobs for those who want more control. You can set a </span></span><strong><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">system prompt</span></span></strong><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> (a role or instruction that guides the AI’s </span><span class="NormalTextRun SpellingErrorV2Themed SCXW52102321 BCX0">behavior</span><span class="NormalTextRun SCXW52102321 BCX0"> globally), adjust generation settings like </span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">temperature</span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> (creativity vs. consistency) and </span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">top-p</span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> or </span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0">top-k</span></span><span class="TextRun SCXW52102321 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW52102321 BCX0"> sampling for controlling randomness, max tokens for responses, etc. These options let you fine-tune how the model responds without writing any code. For instance, you could set a system instruction like “You are a helpful coding assistant,</span><span class="NormalTextRun SCXW52102321 BCX0">”.</span><span class="NormalTextRun SCXW52102321 BCX0"> This is a friendly way to customize </span><span class="NormalTextRun SpellingErrorV2Themed SCXW52102321 BCX0">behavior</span><span class="NormalTextRun SCXW52102321 BCX0">, though </span><span class="NormalTextRun SCXW52102321 BCX0">it’s</span><span class="NormalTextRun SCXW52102321 BCX0"> not as extensive as programmatic control in a CLI tool.</span></span><span class="EOP SCXW52102321 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-5378615 elementor-widget elementor-widget-image" data-id="5378615" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7710" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm2/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2.png" data-orig-size="1920,1080" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM2" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2-1030x579.png" tabindex="0" role="button" width="1030" height="579" src="https://inero-software.com/wp-content/uploads/2025/04/LLM2-1030x579.png" class="attachment-large size-large wp-image-7710" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/LLM2-1030x579.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/LLM2-300x169.png 300w, https://inero-software.com/wp-content/uploads/2025/04/LLM2-768x432.png 768w, https://inero-software.com/wp-content/uploads/2025/04/LLM2-1536x864.png 1536w, https://inero-software.com/wp-content/uploads/2025/04/LLM2-533x300.png 533w, https://inero-software.com/wp-content/uploads/2025/04/LLM2.png 1920w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7710" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm2/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2.png" data-orig-size="1920,1080" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM2" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM2-1030x579.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-8d32129 elementor-widget elementor-widget-text-editor" data-id="8d32129" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="NormalTextRun SCXW87331471 BCX0">Advanced settings – </span><span class="NormalTextRun SCXW87331471 BCX0">simple </span><span class="NormalTextRun SCXW87331471 BCX0">example of </span><span class="NormalTextRun SCXW87331471 BCX0">AI assistant</span><span class="NormalTextRun SCXW87331471 BCX0"> for processing insurance documents</span></h6>						</div>
				</div>
				<div class="elementor-element elementor-element-88a1e0d elementor-widget elementor-widget-text-editor" data-id="88a1e0d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><span class="TextRun SCXW87021791 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW87021791 BCX0"><strong>Local API Server</strong>:</span></span><span class="TextRun SCXW87021791 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW87021791 BCX0"> For developers, LM Studio includes a “Local LLM Server” mode. Just switch to Developer tab, choose the model, and toggle Start button. It enables an API endpoint on localhost that mimics the OpenAI API, allowing other programs to send requests to your local </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW87021791 BCX0">model.</span><span class="NormalTextRun SCXW87021791 BCX0"> This is powerful if you want to integrate the local LLM into your own applications (for example, connecting a chatbot UI or using the model for AI features in an IDE) while still </span><span class="NormalTextRun SCXW87021791 BCX0">benefiting</span><span class="NormalTextRun SCXW87021791 BCX0"> from privacy and not relying on external services.</span></span><span class="EOP SCXW87021791 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-369df09 elementor-widget elementor-widget-image" data-id="369df09" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="7709" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm3/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3.png" data-orig-size="1920,1080" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM3" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3-1030x579.png" tabindex="0" role="button" width="1030" height="579" src="https://inero-software.com/wp-content/uploads/2025/04/LLM3-1030x579.png" class="attachment-large size-large wp-image-7709" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/04/LLM3-1030x579.png 1030w, https://inero-software.com/wp-content/uploads/2025/04/LLM3-300x169.png 300w, https://inero-software.com/wp-content/uploads/2025/04/LLM3-768x432.png 768w, https://inero-software.com/wp-content/uploads/2025/04/LLM3-1536x864.png 1536w, https://inero-software.com/wp-content/uploads/2025/04/LLM3-533x300.png 533w, https://inero-software.com/wp-content/uploads/2025/04/LLM3.png 1920w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7709" data-permalink="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/llm3/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3.png" data-orig-size="1920,1080" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="LLM3" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3-300x169.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/04/LLM3-1030x579.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-5937737 elementor-widget elementor-widget-text-editor" data-id="5937737" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h6><span class="TextRun SCXW123659257 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW123659257 BCX0">Developer tab</span><span class="NormalTextRun SCXW123659257 BCX0"> &#8211;</span><span class="NormalTextRun SCXW123659257 BCX0"> you can </span><span class="NormalTextRun SCXW123659257 BCX0">enable</span><span class="NormalTextRun SCXW123659257 BCX0"> local LLM server</span><span class="NormalTextRun SCXW123659257 BCX0"> hosting your customized LLM</span><span class="NormalTextRun SCXW123659257 BCX0">.</span></span><span class="EOP SCXW123659257 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559685&quot;:720,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></h6>						</div>
				</div>
				<div class="elementor-element elementor-element-06d2cee elementor-widget elementor-widget-text-editor" data-id="06d2cee" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW66765914 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW66765914 BCX0">Using LM Studio is as simple as </span><span class="NormalTextRun SpellingErrorV2Themed SCXW66765914 BCX0">chatGPT</span><span class="NormalTextRun SCXW66765914 BCX0"> – type and get answers – but entirely running on your hardware. The </span></span><span class="TextRun SCXW66765914 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW66765914 BCX0">user-friendly interface</span></span><span class="TextRun SCXW66765914 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW66765914 BCX0"> lowers the barrier to </span><span class="NormalTextRun SCXW66765914 BCX0">entry, since</span><span class="NormalTextRun SCXW66765914 BCX0"> you </span><span class="NormalTextRun SCXW66765914 BCX0">don’t</span><span class="NormalTextRun SCXW66765914 BCX0"> need to use the terminal or remember commands. You get immediate, interactive AI responses, with buttons and menus to manage everything.</span></span><span class="EOP SCXW66765914 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-b60b847 elementor-widget elementor-widget-heading" data-id="b60b847" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Ollama vs. LM Studio: Tool Comparison </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-2d19913 elementor-widget elementor-widget-text-editor" data-id="2d19913" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW228077632 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW228077632 BCX0">Both </span><span class="NormalTextRun SpellingErrorV2Themed SCXW228077632 BCX0">Ollama</span><span class="NormalTextRun SCXW228077632 BCX0"> and LM Studio let you run LLMs locally, but they cater to slightly different audiences and use-cases. </span><span class="NormalTextRun SCXW228077632 BCX0">Here’s</span><span class="NormalTextRun SCXW228077632 BCX0"> a comparison of key aspects to help you understand their differences:</span></span><span class="EOP SCXW228077632 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-872db63 elementor-widget elementor-widget-text-editor" data-id="872db63" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0"><b>Interface &amp; Ease of Use</b>:</span></span> <span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">LM Studio</span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0"> provides a polished </span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">graphical user interface</span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">, which makes it extremely approachable for beginners. </span><span class="NormalTextRun SCXW183865909 BCX0">It’s</span><span class="NormalTextRun SCXW183865909 BCX0"> point-and-click with an integrated chat window, so no technical knowledge is </span><span class="NormalTextRun SCXW183865909 BCX0">required</span><span class="NormalTextRun SCXW183865909 BCX0"> to get </span><span class="NormalTextRun SCXW183865909 BCX0">started.</span> </span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW183865909 BCX0">Ollama</span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">, on the other hand, is a </span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0">command-line interface (CLI)</span></span><span class="TextRun SCXW183865909 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW183865909 BCX0"> tool (with an optional REST API). It offers a lot of power and </span><span class="NormalTextRun SCXW183865909 BCX0">flexibility but</span><span class="NormalTextRun SCXW183865909 BCX0"> does require comfort with the terminal to use </span><span class="NormalTextRun SCXW183865909 BCX0">effectively.</span><span class="NormalTextRun SCXW183865909 BCX0"> Beginners might find </span><span class="NormalTextRun SpellingErrorV2Themed SCXW183865909 BCX0">Ollama’s</span><span class="NormalTextRun SCXW183865909 BCX0"> learning curve steeper, </span><span class="NormalTextRun SCXW183865909 BCX0">whereas</span><span class="NormalTextRun SCXW183865909 BCX0"> LM Studio feels more plug-and-play.</span></span><span class="EOP SCXW183865909 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-4f4ccfd elementor-widget elementor-widget-text-editor" data-id="4f4ccfd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul>
<li style="list-style-type: none;">
<ul>
<li><span class="TextRun SCXW89912573 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW89912573 BCX0"><b>Supported Models:</b></span></span><span class="TextRun SCXW89912573 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW89912573 BCX0"> Both tools support a wide range of open-source LLMs. LM Studio can load any model in GGUF format (the standard for llama.cpp), meaning models like Llama 2 (7B, 13B, 70B), Mistral, Vicuna, Alpaca, </span><span class="NormalTextRun SpellingErrorV2Themed SCXW89912573 BCX0">CodeLlama</span><span class="NormalTextRun SCXW89912573 BCX0">, etc., </span><span class="NormalTextRun SCXW89912573 BCX0">as long as</span><span class="NormalTextRun SCXW89912573 BCX0"> you have the hardware for them&nbsp;</span></span><span class="EOP SCXW89912573 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}">&nbsp;</span></li>
</ul>
</li>
</ul>						</div>
				</div>
				<div class="elementor-element elementor-element-8a420f5 elementor-widget elementor-widget-text-editor" data-id="8a420f5" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul>
<li style="list-style-type: none;">
<ul>
<li><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0"><b>Use Cases Suited</b>:</span></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0"> Because of the above differences, </span></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0"><b>LM Studio is excellent for users who want a personal ChatGPT-like assistant on their PC with minimal setup</b></span></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0">. </span><span class="NormalTextRun SCXW72859417 BCX0">It’s</span><span class="NormalTextRun SCXW72859417 BCX0"> great for interactive Q&amp;A, brainstorming, or casual use – you launch it when you need it, type queries, get answers. </span></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><b><span class="NormalTextRun SpellingErrorV2Themed SCXW72859417 BCX0">Ollama</span><span class="NormalTextRun SCXW72859417 BCX0"> is ideal for developers or those who want to incorporate LLMs into projects or workflows</span></b></span><span class="TextRun SCXW72859417 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW72859417 BCX0"><b>.</b> If you plan to experiment with prompts in scripts, fine-tune model </span><span class="NormalTextRun SpellingErrorV2Themed SCXW72859417 BCX0">behaviors</span><span class="NormalTextRun SCXW72859417 BCX0">, or build an app (like a chatbot, a coding assistant integration, etc.) that calls a local model, </span><span class="NormalTextRun SpellingErrorV2Themed SCXW72859417 BCX0">Ollama’s</span><span class="NormalTextRun SCXW72859417 BCX0"> CLI and API give you that flexibility.</span></span><span class="EOP SCXW72859417 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}">&nbsp;</span></li>
</ul>
</li>
</ul>						</div>
				</div>
				<div class="elementor-element elementor-element-b4240ce elementor-widget elementor-widget-heading" data-id="b4240ce" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Conclusion and Recommendations </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-a417a13 elementor-widget elementor-widget-text-editor" data-id="a417a13" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW229049291 BCX0">Deploying LLMs locally has </span><span class="NormalTextRun SCXW229049291 BCX0">opened up</span><span class="NormalTextRun SCXW229049291 BCX0"> a world of possibilities for developers and enthusiasts. </span><span class="NormalTextRun SCXW229049291 BCX0">We’ve</span><span class="NormalTextRun SCXW229049291 BCX0"> discussed </span></span><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SpellingErrorV2Themed SCXW229049291 BCX0"><b>Ollama</b></span></span><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW229049291 BCX0"> and </span></span><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW229049291 BCX0"><b>LM Studio</b></span></span><span class="TextRun SCXW229049291 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW229049291 BCX0"><b> </b>– two excellent tools that make local AI accessible. To recap some guidance on choosing between them:</span></span><span class="EOP SCXW229049291 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-93b9fb0 elementor-widget elementor-widget-text-editor" data-id="93b9fb0" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><span class="TextRun SCXW193410475 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW193410475 BCX0"><b>Choose LM Studio</b></span></span><span class="TextRun SCXW193410475 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW193410475 BCX0"> if you want a </span></span><span class="TextRun SCXW193410475 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW193410475 BCX0"><b>plug-and-play AI chat experience</b></span></span><span class="TextRun SCXW193410475 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW193410475 BCX0"> with a friendly GUI. </span><span class="NormalTextRun SCXW193410475 BCX0">It’s</span><span class="NormalTextRun SCXW193410475 BCX0"> perfect for beginners or those who prefer not to tinker with command lines. You get quick setup, easy model downloads, and a nice chat interface for </span><span class="NormalTextRun SCXW193410475 BCX0">interactions.</span><span class="NormalTextRun SCXW193410475 BCX0"> This might be best for someone who just wants an “offline ChatGPT” for personal use, note-taking, or idea generation without fussing over configurations. </span><span class="NormalTextRun SCXW193410475 BCX0">It’s</span><span class="NormalTextRun SCXW193410475 BCX0"> also a convenient way to demo LLM capabilities to non-technical users (since it feels like a normal app).</span></span><span class="EOP SCXW193410475 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-dcecadd elementor-widget elementor-widget-text-editor" data-id="dcecadd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<ul><li style="list-style-type: none;"><ul><li><span class="TextRun SCXW242147822 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><b><span class="NormalTextRun SCXW242147822 BCX0">Choose </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242147822 BCX0">Ollama</span></b></span><span class="TextRun SCXW242147822 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW242147822 BCX0"><b> </b>if you want </span></span><span class="TextRun SCXW242147822 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW242147822 BCX0">more<b> control, automation, or integration</b></span></span><span class="TextRun SCXW242147822 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW242147822 BCX0"><b>.</b> Developers and power users will appreciate its flexibility – you can script it, run it headless on a server, integrate the local LLM into your own apps via the API, and fine-tune model </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242147822 BCX0">behavior</span><span class="NormalTextRun SCXW242147822 BCX0"> with </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242147822 BCX0">Modelfiles</span><span class="NormalTextRun SCXW242147822 BCX0"> . If </span><span class="NormalTextRun SCXW242147822 BCX0">you’re</span><span class="NormalTextRun SCXW242147822 BCX0"> comfortable with a terminal and want to customize how the AI works (beyond what a GUI allows), </span><span class="NormalTextRun SpellingErrorV2Themed SCXW242147822 BCX0">Ollama</span><span class="NormalTextRun SCXW242147822 BCX0"> is a better fit. </span><span class="NormalTextRun SCXW242147822 BCX0">It’s</span><span class="NormalTextRun SCXW242147822 BCX0"> also lightweight if you intend to run background AI services continuously.</span></span><span class="EOP SCXW242147822 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:220,&quot;335559739&quot;:220}"> </span></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-cd8dc50 elementor-widget elementor-widget-text-editor" data-id="cd8dc50" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW16460031 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW16460031 BCX0">Finally, remember that the </span></span><span class="TextRun SCXW16460031 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW16460031 BCX0">LLM itself</span></span><span class="TextRun SCXW16460031 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW16460031 BCX0"> (the model you choose) is as important as the tool. Spend time finding a model that suits your task – whether </span><span class="NormalTextRun SCXW16460031 BCX0">it’s</span><span class="NormalTextRun SCXW16460031 BCX0"> a concise summarizer or a creative storyteller – and fits your hardware. Both </span><span class="NormalTextRun SpellingErrorV2Themed SCXW16460031 BCX0">Ollama</span><span class="NormalTextRun SCXW16460031 BCX0"> and LM Studio make it easy to swap models, so </span><span class="NormalTextRun SCXW16460031 BCX0">you’re</span><span class="NormalTextRun SCXW16460031 BCX0"> not locked in. The ecosystem of open-source models is growing rapidly, which means running a powerful AI on your own device is only getting easier and more common.</span></span><span class="EOP SCXW16460031 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-5a3b149 elementor-widget elementor-widget-text-editor" data-id="5a3b149" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW154420196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW154420196 BCX0">In summary</span></span><span class="TextRun SCXW154420196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW154420196 BCX0">, deploying LLMs locally with these tools gives you the best of both worlds: AI capabilities </span><span class="NormalTextRun SCXW154420196 BCX0">similar to</span><span class="NormalTextRun SCXW154420196 BCX0"> cloud services, but with </span></span><span class="TextRun SCXW154420196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW154420196 BCX0"><b>privacy, control, and zero ongoing cost</b></span></span><span class="TextRun SCXW154420196 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="none"><span class="NormalTextRun SCXW154420196 BCX0"><b>.</b> Whether you go with a command-line power tool like </span><span class="NormalTextRun SpellingErrorV2Themed SCXW154420196 BCX0">Ollama</span><span class="NormalTextRun SCXW154420196 BCX0"> or a user-friendly app like LM Studio, </span><span class="NormalTextRun SCXW154420196 BCX0">you’ll</span><span class="NormalTextRun SCXW154420196 BCX0"> be joining the </span><span class="NormalTextRun SCXW154420196 BCX0">cutting edge</span><span class="NormalTextRun SCXW154420196 BCX0"> of local AI development. Happy </span><span class="NormalTextRun ContextualSpellingAndGrammarErrorV2Themed SCXW154420196 BCX0">experimenting, and</span><span class="NormalTextRun SCXW154420196 BCX0"> enjoy your new personal AI running right on your machine!</span></span><span class="EOP SCXW154420196 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}">&nbsp;</span></p>						</div>
				</div>
					</div>
				</div>
				</div>
		<p>Artykuł <a href="https://inero-software.com/deploying-llms-locally-a-guide-to-ollama-and-lm-studio/">Deploying LLMs Locally: A Guide to Ollama and LM Studio</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7692</post-id>	</item>
		<item>
		<title>What are AI Agents and how can they help your company</title>
		<link>https://inero-software.com/what-are-ai-agents-and-how-can-they-help-your-company/</link>
		
		<dc:creator><![CDATA[Marta Kuprasz]]></dc:creator>
		<pubDate>Fri, 28 Feb 2025 09:51:15 +0000</pubDate>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[SOLUTIONS]]></category>
		<category><![CDATA[AI Agents]]></category>
		<category><![CDATA[AI Algorithms]]></category>
		<category><![CDATA[AI assistants]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[DigitalTransformation]]></category>
		<category><![CDATA[Gemini]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Natural Language Processing]]></category>
		<category><![CDATA[NLP]]></category>
		<category><![CDATA[virtual assistants]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7498</guid>

					<description><![CDATA[<p>In this article, we will take a closer look at AI Agents, which can provide valuable support, particularly in back-office processes.</p>
<p>Artykuł <a href="https://inero-software.com/what-are-ai-agents-and-how-can-they-help-your-company/">What are AI Agents and how can they help your company</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7498" class="elementor elementor-7498" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-2ddef76 e-flex e-con-boxed e-con e-parent" data-id="2ddef76" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-e2ded1d elementor-widget elementor-widget-html" data-id="e2ded1d" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-826db69 elementor-widget elementor-widget-text-editor" data-id="826db69" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4>The term <strong>artificial intelligence</strong> has been prominently featured in numerous publications as a solution to challenges related to efficiency, organization, and creativity. Many companies are following this trend, striving to incorporate AI-driven solutions into their offerings. These efforts take various forms. In this article, we will take a closer look at <strong>AI Agents</strong>, which can provide valuable support, particularly in back-office processes.</h4>						</div>
				</div>
				<div class="elementor-element elementor-element-76b8aa2 elementor-widget elementor-widget-text-editor" data-id="76b8aa2" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>For some time now, we have been observing a significant rise in the popularity of terms related to the use of artificial intelligence. So, let&#8217;s start from the beginning.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-480bc97 elementor-widget elementor-widget-heading" data-id="480bc97" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">What is "Artificial Intelligence"?</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-3196501 elementor-widget elementor-widget-text-editor" data-id="3196501" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The term &#8220;artificial intelligence&#8221; encompasses Large Language Models (LLMs), natural language processing (NLP) systems, machine learning algorithms, neural networks, and generative AI models.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-dec9054 elementor-widget elementor-widget-text-editor" data-id="dec9054" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>LLMs, such as<a href="https://chatgpt.com/"> ChatGPT from OpenAI</a> or <a href="https://gemini.google.com/app?hl=pl">Gemini from Google</a>, are models trained on vast datasets that can analyze, process, and generate text in a way that mimics human reasoning. They are used in various applications, ranging from chatbots and voice assistants to advanced systems supporting business analysis and process automation in companies.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-b208bcb elementor-widget elementor-widget-text-editor" data-id="b208bcb" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Artificial intelligence is not limited to text processing. Modern models can also analyze images, audio, video, and numerical data, making them highly versatile tools in business. AI enables not only the automation of repetitive tasks but also the detection of patterns in large datasets, trend forecasting, and support for strategic decision-making in companies.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-3ff848a elementor-widget elementor-widget-heading" data-id="3ff848a" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Who are AI agents?</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-ab7f584 elementor-widget elementor-widget-text-editor" data-id="ab7f584" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>&#8220;AI agents&#8221; are intelligent systems based on machine learning algorithms, natural language processing (NLP) models, and Large Language Models (LLMs). Their purpose is to automate processes, support decision-making, and interact with users in a natural and context-aware manner.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-ceaa7cb elementor-widget elementor-widget-text-editor" data-id="ceaa7cb" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>This means that virtual assistants are based on well-known and widely used LLMs such as ChatGPT, Gemini, Claude, Mistral, or DeepSeek, which can generate coherent responses, analyze texts, and adapt to the context of a conversation.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-fa301d3 elementor-widget elementor-widget-text-editor" data-id="fa301d3" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>However, AI agents differ from language models in that they are designed to perform specific tasks autonomously. In practice, this means they are equipped with additional modules that enable them to gather information, process data in real-time, and make decisions based on business rules.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-8bb76b7 elementor-widget elementor-widget-text-editor" data-id="8bb76b7" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Unlike traditional chatbots, AI agents not only answer questions but can also handle complex processes, integrate with enterprise systems, and learn from user interactions. As a result, they are used in various areas, from administrative support and document analysis to the automation of operational processes in enterprises.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-41b2d06 elementor-widget elementor-widget-heading" data-id="41b2d06" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default"><a href="https://inero-software.com/meet-your-personal-ai-agent-a-case-study-for-a-freight-forwarding-company/">Also read: Meet Your Personal AI Agent – A Case Study for a Freight Company</a></h4>		</div>
				</div>
				<div class="elementor-element elementor-element-5716208 elementor-widget__width-initial elementor-widget elementor-widget-video" data-id="5716208" data-element_type="widget" data-settings="{&quot;youtube_url&quot;:&quot;https:\/\/youtu.be\/B4VxxjWYzDM&quot;,&quot;autoplay&quot;:&quot;yes&quot;,&quot;play_on_mobile&quot;:&quot;yes&quot;,&quot;video_type&quot;:&quot;youtube&quot;,&quot;controls&quot;:&quot;yes&quot;}" data-widget_type="video.default">
				<div class="elementor-widget-container">
					<div class="elementor-wrapper elementor-open-inline">
			<div class="elementor-video"></div>		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-7f6ac96 elementor-widget elementor-widget-text-editor" data-id="7f6ac96" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The operation of AI agents is based on several key components:</p><ul><li style="list-style-type: none;"><ul><li><strong>Communication interface</strong> – allows the agent to interact with users through text, speech, or other data formats.</li><li><strong>Decision engine</strong> – based on AI models and business rules, it enables situation analysis and the selection of optimal actions.</li><li><strong>Integration with external systems</strong> – AI agents often operate in conjunction with databases, business applications (ERP, CRM), or cloud services, allowing them to access up-to-date information.</li><li><strong>Process automation</strong> – they can perform specific tasks, such as generating reports, processing requests, sending notifications, or initiating predefined processes in IT systems.</li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-6a3e88a elementor-widget elementor-widget-heading" data-id="6a3e88a" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">What are the types of AI agents?</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-8a579f3 elementor-widget elementor-widget-text-editor" data-id="8a579f3" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>AI agents may take various forms depending on their application and level of autonomy. Leveraging advanced artificial intelligence models, they can assist users in a wide range of activities, from customer support to data analysis and business process management.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-a062ae8 elementor-widget elementor-widget-text-editor" data-id="a062ae8" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>We can distinguish several main types of AI agents:</p><ul><li style="list-style-type: none;"><ul><li><strong>Conversational agents</strong> – include chatbots and voicebots that interact with users through text or speech. They can answer questions, handle customer inquiries, and support sales processes.</li><li><strong>Analytical agents</strong> – specialize in processing and interpreting data. They use machine learning algorithms to analyze trends, detect anomalies, and generate reports.</li><li><strong>Operational agents</strong> – automate business tasks by integrating with enterprise systems. They can manage documentation, process documents, or coordinate activities within corporate processes.</li><li><strong>Autonomous agents</strong> – operate independently, making decisions based on collected data and predefined business rules. They are used in areas such as logistics, resource management, and dynamic operational planning.</li><li><strong>Decision-support agents</strong> – provide recommendations based on advanced data analysis, helping managers and specialists make strategic decisions.</li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-b1010f3 elementor-widget elementor-widget-text-editor" data-id="b1010f3" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Each of these types can operate independently or collaborate with other systems, creating a complex AI-driven environment. In the following sections, we will explore specific applications of AI agents and their impact on the operational efficiency of businesses.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-4104376 elementor-widget elementor-widget-heading" data-id="4104376" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Cloud or on-premise solution – how can an AI agent be implemented in a corporate environment?</h3>		</div>
				</div>
					</div>
				</div>
		<div class="elementor-element elementor-element-2c27fe1 e-flex e-con-boxed e-con e-parent" data-id="2c27fe1" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-7dc1486 elementor-widget elementor-widget-text-editor" data-id="7dc1486" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>Implementing an AI agent in an organization requires selecting the appropriate deployment model that best meets business, technical, and regulatory requirements. Companies can choose between a cloud-based solution (SaaS) or an on-premise deployment, depending on their needs for flexibility, security, and integration with existing systems.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-5af4758 elementor-widget elementor-widget-text-editor" data-id="5af4758" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The choice of the appropriate model depends on various factors, which are presented in the table below.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d9f8cc1 elementor-widget elementor-widget-html" data-id="d9f8cc1" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Comparison: SaaS vs On-Premise</title>
    <link href="https://fonts.googleapis.com/css2?family=Roboto:wght@300&amp;display=swap" rel="stylesheet">
    <style>
        .table-container {
            width: 100%;
            overflow-x: auto; /* Enables horizontal scrolling */
        }
        .custom-table {
            width: 100%;
            min-width: 600px; /* Ensures the table is not too small */
            border-collapse: collapse;
            font-family: 'Roboto', sans-serif;
            font-size: 14px;
            font-weight: 300;
            color: #1C244B;
        }
        .custom-table th, .custom-table td {
            border: 1px solid #000;
            padding: 10px;
            text-align: justify;
        }
        .custom-table th {
            background: #ddd;
            font-weight: bold;
            text-align: center;
        }
        .custom-table tr:nth-child(even) {
            background: #f9f9f9;
        }

        /* Responsive adjustments for smaller screens */
        @media screen and (max-width: 768px) {
            .custom-table th, .custom-table td {
                padding: 8px; /* Reduces padding on small screens */
                font-size: 12px; /* Reduces text size */
            }
        }
    </style>
</head>
<body>

<div class="table-container">
    <table class="custom-table">
        <tr>
            <th>Criterion</th>
            <th>SaaS (Cloud)</th>
            <th>On-Premise (Local)</th>
        </tr>
        <tr>
            <td>Deployment model</td>
            <td>Cloud-based (AWS, Azure, Google Cloud)</td>
            <td>Operates on the company’s own infrastructure</td>
        </tr>
        <tr>
            <td>Infrastructure</td>
            <td>Cloud service provider</td>
            <td>Local servers</td>
        </tr>
        <tr>
            <td>Initial costs</td>
            <td>Low</td>
            <td>High</td>
        </tr>
        <tr>
            <td>Operational costs</td>
            <td>Subscription-based</td>
            <td>Fixed maintenance and energy costs</td>
        </tr>
        <tr>
            <td>Scalability</td>
            <td>Very high</td>
            <td>Limited (dependent on hardware)</td>
        </tr>
        <tr>
            <td>Data security</td>
            <td>Limited (processed outside the company)</td>
            <td>High (full control over data)</td>
        </tr>
        <tr>
            <td>Regulatory compliance</td>
            <td>May require additional agreements and certifications</td>
            <td>Easier to meet regulatory requirements</td>
        </tr>
        <tr>
            <td>Ease of implementation</td>
            <td>Easy and fast</td>
            <td>Requires hardware purchase and setup</td>
        </tr>
        <tr>
            <td>Updates and maintenance</td>
            <td>Automatic, provided by the vendor</td>
            <td>Self-managed updates and maintenance</td>
        </tr>
        <tr>
            <td>Integration with enterprise systems</td>
            <td>Strong API support and pre-built integrations</td>
            <td>Full control but may require additional integration</td>
        </tr>
    </table>
</div>

</body>
</html>
		</div>
				</div>
				<div class="elementor-element elementor-element-0ed2afd elementor-widget elementor-widget-text-editor" data-id="0ed2afd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The choice of the appropriate deployment model—cloud-based or on-premise—depends on the company&#8217;s specific requirements regarding security, costs, and integration with existing systems. Regardless of the chosen strategy, AI agents can significantly enhance operational efficiency and allow employees to focus on tasks that require creativity and strategic thinking.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-b286bc4 elementor-widget elementor-widget-text-editor" data-id="b286bc4" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The development of AI technology is undoubtedly one of the strongest technological trends in recent years. Therefore, it is worth considering now how AI agents can support your company&#8217;s growth and become a key element of its digital transformation.</p>						</div>
				</div>
					</div>
				</div>
		<div class="elementor-element elementor-element-86316a7 e-flex e-con-boxed e-con e-parent" data-id="86316a7" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-42ec473 elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="42ec473" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<a class="elementor-cta" href="https://inero-software.com/contact-us/">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/02/cta-AI2-1030x579.png);" role="img" aria-label="cta AI2"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						We will create an AI Agent for your company.					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Contact us to learn how we can help you implement a new AI-based solution.					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<span class="elementor-cta__button elementor-button elementor-size-">
						Contact us 					</span>
					</div>
							</div>
						</a>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>Artykuł <a href="https://inero-software.com/what-are-ai-agents-and-how-can-they-help-your-company/">What are AI Agents and how can they help your company</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7498</post-id>	</item>
		<item>
		<title>Meet Your Personal AI Agent: A Case Study for a Freight Forwarding Company</title>
		<link>https://inero-software.com/meet-your-personal-ai-agent-a-case-study-for-a-freight-forwarding-company/</link>
		
		<dc:creator><![CDATA[Marta Kuprasz]]></dc:creator>
		<pubDate>Fri, 21 Feb 2025 11:27:19 +0000</pubDate>
				<category><![CDATA[AI]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[SOLUTIONS]]></category>
		<category><![CDATA[AGENT]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[BusinessProcessesOptimization]]></category>
		<category><![CDATA[Case study]]></category>
		<category><![CDATA[Gemini]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=7341</guid>

					<description><![CDATA[<p>AI-driven tools are becoming increasingly prevalent across various industries, streamlining processes from simple graphic design and translations to advanced document, email, and database analysis. In this article, we will present a practical business application of an AI assistant in action. AI Agents have a wide range of applications, and their&#8230;</p>
<p>Artykuł <a href="https://inero-software.com/meet-your-personal-ai-agent-a-case-study-for-a-freight-forwarding-company/">Meet Your Personal AI Agent: A Case Study for a Freight Forwarding Company</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="7341" class="elementor elementor-7341" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-c1e7efe e-flex e-con-boxed e-con e-parent" data-id="c1e7efe" data-element_type="container">
					<div class="e-con-inner">
				<div class="elementor-element elementor-element-a609c1d elementor-widget elementor-widget-html" data-id="a609c1d" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-ff753e4 elementor-widget elementor-widget-text-editor" data-id="ff753e4" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h5>AI-driven tools are becoming increasingly prevalent across various industries, streamlining processes from simple graphic design and translations to advanced document, email, and database analysis. In this article, we will present a practical business application of an AI assistant in action.</h5>						</div>
				</div>
				<div class="elementor-element elementor-element-ee85ecd elementor-widget elementor-widget-text-editor" data-id="ee85ecd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>AI Agents have a wide range of applications, and their full potential is still being discovered. The main advantages of AI-powered assistants include:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-06bf962 elementor-widget elementor-widget-text-editor" data-id="06bf962" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h5 data-start="49" data-end="91"><strong data-start="54" data-end="89">1. Automating Routine Processes</strong></h5><p data-start="92" data-end="294">AI agents can handle repetitive tasks such as customer inquiries, document analysis, and data management. By automating these processes, businesses can reduce operational costs and improve efficiency.</p><p data-start="92" data-end="294"> </p><h5 data-start="296" data-end="344"><strong data-start="301" data-end="342">2. Personalized Customer Interactions</strong></h5><p data-start="345" data-end="506">By analyzing data, AI agents can provide personalized recommendations and tailored offers, enhancing customer engagement and improving overall user experience.</p><p data-start="345" data-end="506"> </p><h5 data-start="508" data-end="544"><strong data-start="513" data-end="542">3. Speed and Availability</strong></h5><p data-start="545" data-end="739">AI operates 24/7, delivering instant responses and real-time support. This is particularly valuable in industries that require quick reaction times, such as e-commerce, finance, and logistics.</p><p data-start="545" data-end="739"> </p><h5 data-start="741" data-end="777"><strong data-start="746" data-end="775">4. Advanced Data Analysis</strong></h5><p data-start="778" data-end="931">AI-powered agents can process vast amounts of data in a short time, identifying patterns and correlations that support better business decision-making.</p><p data-start="778" data-end="931"> </p><h5 data-start="933" data-end="983"><strong data-start="938" data-end="981">5. Optimizing Decision-Making Processes</strong></h5><p data-start="984" data-end="1145">With predictive modeling, AI assists in demand forecasting, risk management, and supply chain optimization, helping organizations make more informed decisions.</p><p data-start="984" data-end="1145"> </p><h5 data-start="1147" data-end="1203"><strong data-start="1152" data-end="1201">6. Seamless Integration with Existing Systems</strong></h5><p data-start="1204" data-end="1369">Modern AI solutions can be easily integrated into existing ERP, CRM, and analytics platforms, enhancing their capabilities and improving overall system efficiency.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-11dd40f elementor-widget elementor-widget-heading" data-id="11dd40f" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">A Practical Example of AI Agent Use in the Transport Industry</h3>		</div>
				</div>
				<div class="elementor-element elementor-element-49baa37 elementor-widget__width-initial elementor-widget elementor-widget-video" data-id="49baa37" data-element_type="widget" data-settings="{&quot;youtube_url&quot;:&quot;https:\/\/youtu.be\/B4VxxjWYzDM&quot;,&quot;video_type&quot;:&quot;youtube&quot;,&quot;controls&quot;:&quot;yes&quot;}" data-widget_type="video.default">
				<div class="elementor-widget-container">
					<div class="elementor-wrapper elementor-open-inline">
			<div class="elementor-video"></div>		</div>
				</div>
				</div>
				<div class="elementor-element elementor-element-a9b9980 elementor-widget elementor-widget-text-editor" data-id="a9b9980" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>AI agents can be applied across various industries, including banking, sales, and human resource management. In this text, we will focus on a freight forwarding company that handles anywhere from a few to dozens of shipments daily.</p><p> </p><p> </p><p>Freight forwarders deal with constant communication and the verification of numerous documents. Each of these tasks takes time—a resource that is often in short supply—making errors more likely when the workload is high.</p><p> </p><p> </p><p>How can time management be improved? By automating repetitive and predictable tasks. This is where an AI Agent comes in. Here’s an example of an AI assistant we developed, powered by <a href="https://gemini.google.com/app?hl=pl">Google’s Large Language Model, Gemini.</a></p><p> </p><p>One possible application is the following scenario:</p><p> </p><p> </p><p>A freight forwarder receives an email that should include an insurance policy along with proof of payment. The AI Agent automatically, without needing to be prompted, checks whether the email contains the required attachments. If they are included, it proceeds to verify the following details:</p>						</div>
				</div>
				<div class="elementor-element elementor-element-d2f738d elementor-widget elementor-widget-text-editor" data-id="d2f738d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><strong>In the Insurance Policy:</strong></p><ul><li style="list-style-type: none;"><ul><li style="list-style-type: none;"><ul><li>Policy number</li><li>Insurance period and whether it is currently valid</li><li>Insured party details, including tax identification number and address</li><li>Bank account number for premium payment</li></ul></li></ul></li></ul><p><strong>In the Payment Confirmation:</strong></p><ul><li style="list-style-type: none;"><ul><li style="list-style-type: none;"><ul><li>Payment reference</li><li>Amount</li><li>Bank account number</li><li>Payment date</li><li>Whether the transfer corresponds to the submitted policy (e.g., based on the reference, account number)</li></ul></li></ul></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-3edb3a5 elementor-widget elementor-widget-image" data-id="3edb3a5" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" width="1030" height="366" src="https://inero-software.com/wp-content/uploads/2025/02/analysis-take1-1030x366.png" class="attachment-large size-large wp-image-7334" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/02/analysis-take1-1030x366.png 1030w, https://inero-software.com/wp-content/uploads/2025/02/analysis-take1-300x107.png 300w, https://inero-software.com/wp-content/uploads/2025/02/analysis-take1-768x273.png 768w, https://inero-software.com/wp-content/uploads/2025/02/analysis-take1-1536x546.png 1536w, https://inero-software.com/wp-content/uploads/2025/02/analysis-take1-844x300.png 844w, https://inero-software.com/wp-content/uploads/2025/02/analysis-take1.png 1722w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7334" data-permalink="https://inero-software.com/pl/poznaj-swojego-osobistego-agenta-ai-case-study-dla-firmy-spedycyjnej/analysis-take1/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/02/analysis-take1.png" data-orig-size="1722,612" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="analysis-take1" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/02/analysis-take1-300x107.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/02/analysis-take1-1030x366.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-a59477b elementor-widget elementor-widget-text-editor" data-id="a59477b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>The AI Agent then transfers the extracted data into a designated Excel file, which is continuously updated. The data file can be formatted accordingly, for example, by highlighting entries in red where the insurance policy is invalid or the payment has not been verified. </p>						</div>
				</div>
				<div class="elementor-element elementor-element-ba91f9f elementor-widget elementor-widget-image" data-id="ba91f9f" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" width="1030" height="402" src="https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751-1030x402.png" class="attachment-large size-large wp-image-7335" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751-1030x402.png 1030w, https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751-300x117.png 300w, https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751-768x299.png 768w, https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751-1536x599.png 1536w, https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751-770x300.png 770w, https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751.png 1539w" sizes="(max-width: 1030px) 100vw, 1030px" data-attachment-id="7335" data-permalink="https://inero-software.com/pl/poznaj-swojego-osobistego-agenta-ai-case-study-dla-firmy-spedycyjnej/zrzut-ekranu-2025-02-21-112751/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751.png" data-orig-size="1539,600" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Zrzut ekranu 2025-02-21 112751" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751-300x117.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/02/Zrzut-ekranu-2025-02-21-112751-1030x402.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-5abc3ea elementor-widget elementor-widget-text-editor" data-id="5abc3ea" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p>In this simple way, instead of searching through their inbox for the right emails, the freight forwarder can check the Excel file to see if the documents have been received from a specific sender and whether they are correct. This saves a significant amount of time and ensures data accuracy.</p><p> </p><p>There are many ways to further develop our AI Assistant. It can be integrated with other tools, such as Slack or other communication platforms, to send notifications about missing documents or generate automated email responses. An AI-powered agent can be tailored to the specific needs of a company, a department, or even an individual role.</p>						</div>
				</div>
				<div class="elementor-element elementor-element-aee5ea0 elementor-cta--skin-cover elementor-widget__width-inherit elementor-hidden-mobile elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="aee5ea0" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<div class="elementor-cta">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2025/02/cta-AI2-1030x579.png);" role="img" aria-label="cta AI2"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Do you want to explore the possibilities of AI Agents?​					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						Schedule a meeting. We’d be happy to discuss the possibilities.					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<a class="elementor-cta__button elementor-button elementor-size-" href="https://calendar.google.com/calendar/u/0/appointments/schedules/AcZssZ3e3C_1YeBkt1uCr_qfOnG_N298UgLFwORcSTXigrPfOk0ls3ok-Uw_dSeGCoLdtYsN13GMm-n-">
						SCHEDULE A MEETING					</a>
					</div>
							</div>
						</div>
				</div>
				</div>
					</div>
				</div>
				</div>
		<p>Artykuł <a href="https://inero-software.com/meet-your-personal-ai-agent-a-case-study-for-a-freight-forwarding-company/">Meet Your Personal AI Agent: A Case Study for a Freight Forwarding Company</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7341</post-id>	</item>
		<item>
		<title>Assessing Retrieval-Augmented Generation (RAG) Large Language Models (LLMs) with DeepEval for Complex Tabular Data</title>
		<link>https://inero-software.com/assessing-retrieval-augmented-generation-rag-large-language-models-llms-with-deepeval-for-complex-tabular-data/</link>
		
		<dc:creator><![CDATA[Martyna Mul]]></dc:creator>
		<pubDate>Tue, 04 Feb 2025 10:33:15 +0000</pubDate>
				<category><![CDATA[Company]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[DeepEval]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<category><![CDATA[RAG]]></category>
		<category><![CDATA[Retrieval-Augmented Generation]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=6902</guid>

					<description><![CDATA[<p>This post explores how DeepEval helps systematically assess the effectiveness of both retrieval and generation components, ensuring more reliable machine-generated insights. </p>
<p>Artykuł <a href="https://inero-software.com/assessing-retrieval-augmented-generation-rag-large-language-models-llms-with-deepeval-for-complex-tabular-data/">Assessing Retrieval-Augmented Generation (RAG) Large Language Models (LLMs) with DeepEval for Complex Tabular Data</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[		<div data-elementor-type="wp-post" data-elementor-id="6902" class="elementor elementor-6902" data-elementor-post-type="post">
				<div class="elementor-element elementor-element-a77f132 e-flex e-con-boxed e-con e-parent" data-id="a77f132" data-element_type="container">
					<div class="e-con-inner">
		<div class="elementor-element elementor-element-eedef0f e-con-full e-flex e-con e-child" data-id="eedef0f" data-element_type="container">
				</div>
		<div class="elementor-element elementor-element-8bb2c58 e-con-full e-flex e-con e-child" data-id="8bb2c58" data-element_type="container">
		<div class="elementor-element elementor-element-cac0d92 e-con-full e-flex e-con e-child" data-id="cac0d92" data-element_type="container">
				<div class="elementor-element elementor-element-f3a0ecb elementor-widget elementor-widget-html" data-id="f3a0ecb" data-element_type="widget" data-widget_type="html.default">
				<div class="elementor-widget-container">
			 		</div>
				</div>
				<div class="elementor-element elementor-element-33c698c elementor-widget elementor-widget-text-editor" data-id="33c698c" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<h4><span class="TextRun SCXW184211874 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW184211874 BCX0">Retrieval-Augmented Generation (RAG) models are transforming the capabilities of intelligent assistants, enabling more </span><span class="NormalTextRun SCXW184211874 BCX0">accurate</span><span class="NormalTextRun SCXW184211874 BCX0"> and context-aware responses to user queries. Unlike traditional large language models (LLMs), RAG-based systems integrate two essential components: a retrieval mechanism that fetches relevant documents and a generative model that synthesizes responses based on real-time </span><span class="NormalTextRun SCXW184211874 BCX0">data. This post explores how </span><span class="NormalTextRun SCXW184211874 BCX0">DeepEval</span><span class="NormalTextRun SCXW184211874 BCX0"> helps systematically assess the effectiveness of both retrieval and generation components, ensuring more reliable machine-generated insights.</span></span><span class="EOP TrackedChange SCXW184211874 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></h4>						</div>
				</div>
				<div class="elementor-element elementor-element-6e6ea96 elementor-widget elementor-widget-text-editor" data-id="6e6ea96" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">While RAG-enhanced virtual assistants significantly improve answer relevance, evaluating their performance remains a challenge. Since these models rely on both retrieval and text generation, a weak document-fetching step can lead to misleading or incorrect responses, even if the underlying LLM is highly advanced.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">We’ll</span> <span data-contrast="auto">demonstrate</span><span data-contrast="auto"> this process using our custom AI-driven assistant</span><span data-contrast="auto">, designed to answer complex queries about </span><span data-contrast="auto">maritime economy statistics</span><span data-contrast="auto">, </span><span data-contrast="auto">showcasing</span><span data-contrast="auto"> how </span><span data-contrast="auto">LLM-powered knowledge retrieval enhances data-driven decision-making.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-d0eedd1 elementor-widget elementor-widget-heading" data-id="d0eedd1" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">SeaStat - Our AI Assistant </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-caacdce elementor-widget elementor-widget-text-editor" data-id="caacdce" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0">A great example</span></span></span><span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0"> that we can use to discuss this topic is the </span></span></span><span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0">SeaStat</span></span></span> <span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0">AI Assistant</span></span></span><span class="TrackChangeTextInsertion TrackedChange SCXW210561514 BCX0"><span class="TextRun SCXW210561514 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW210561514 BCX0"> developed by us as part of the Incone60 Green Project (https://www.incone60.eu/). The goal of the project is to improve the competitiveness and sustainable development of small seaports in the South Baltic region.</span></span></span><span class="EOP SCXW210561514 BCX0" data-ccp-props="{&quot;335551550&quot;:6,&quot;335551620&quot;:6}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-42941ab elementor-widget elementor-widget-image" data-id="42941ab" data-element_type="widget" data-widget_type="image.default">
				<div class="elementor-widget-container">
													<img loading="lazy" decoding="async" data-attachment-id="6904" data-permalink="https://inero-software.com/assessing-retrieval-augmented-generation-rag-large-language-models-llms-with-deepeval-for-complex-tabular-data/seastat/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png" data-orig-size="517,587" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="SeaStat" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/02/SeaStat-264x300.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png" tabindex="0" role="button" width="517" height="587" src="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png" class="attachment-large size-large wp-image-6904" alt="" srcset="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png 517w, https://inero-software.com/wp-content/uploads/2025/02/SeaStat-264x300.png 264w" sizes="(max-width: 517px) 100vw, 517px" data-attachment-id="6904" data-permalink="https://inero-software.com/assessing-retrieval-augmented-generation-rag-large-language-models-llms-with-deepeval-for-complex-tabular-data/seastat/" data-orig-file="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png" data-orig-size="517,587" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="SeaStat" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2025/02/SeaStat-264x300.png" data-large-file="https://inero-software.com/wp-content/uploads/2025/02/SeaStat.png" role="button" />													</div>
				</div>
				<div class="elementor-element elementor-element-bf0da8d elementor-widget elementor-widget-text-editor" data-id="bf0da8d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentStart CommentHighlightPipeRestRefresh CommentHighlightRest SCXW10433028 BCX0">Duri</span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">ng Incone60 Gren Project w</span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">e have developed an AI assistant that answers questions about maritime economy data, providing instant access to structured maritime economic insights. This assistant </span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">leverages</span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0"> a </span></span><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">Retrieval-Augmented Generation (RAG)</span></span><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0"> approach, ensuring that responses are grounded in a structured database covering key aspects such as </span></span><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">seaports, maritime transport</span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">,</span><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0"> shipbuilding, passenger traffic, trade, and the fishing industry</span></span><span class="TextRun SCXW10433028 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun CommentHighlightRest SCXW10433028 BCX0">.</span></span><span class="EOP CommentHighlightPipeRestRefresh SCXW10433028 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-85e0e3e elementor-widget elementor-widget-text-editor" data-id="85e0e3e" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Our AI assistant operates within a RAG pipeline that integrates:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li data-leveltext="" data-font="Symbol" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">A structured maritime economy database</span></b><span data-contrast="auto">, which includes global and Polish maritime statistics from 2017 to 2020. The data is sourced from publications by Gdynia Maritime University, which aggregate statistics from various government institutes, universities, and port enterprises. The database consists of 50 tables, covering key aspects of maritime transport and is planned to be further extended with additional years. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Dynamic SQL generation</span></b><span data-contrast="auto"> to extract relevant information from the database.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="2" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><b><span data-contrast="auto">A generative LLM</span></b><span data-contrast="auto"> that formulates answers based on the retrieved data.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><p><span data-contrast="auto">Building such an assistant requires several key decisions and parameter optimizations, including:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li data-leveltext="" data-font="Symbol" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="auto">Selecting the most suitable LLM model and tuning parameters (e.g., temperature).</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="auto">Designing an effective prompt structure.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="3" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="auto">Ensuring the assistant consistently selects the most relevant tables from the dataset.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><p><span data-contrast="auto">This is where </span><b><span data-contrast="auto">automatic testing</span></b><span data-contrast="auto"> becomes crucial. It helps assess system performance, identify weaknesses, and ensure continuous improvement.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-648b32b elementor-widget elementor-widget-heading" data-id="648b32b" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">LLM-as-a-Judge: Automating RAG Model Evaluation  </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-09073bd elementor-widget elementor-widget-text-editor" data-id="09073bd" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Evaluating systems that generate non-deterministic, open-ended text outputs can be challenging because there is often no single &#8220;correct&#8221; answer. While human evaluation is accurate, it can be costly and time-consuming.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><b><span data-contrast="auto">LLM-as-a-Judge</span></b><span data-contrast="auto"> is a method that approximates human evaluation by rating the system&#8217;s output based on custom criteria tailored to your specific application. One such testing framework is </span><b><span data-contrast="auto">DeepEval</span></b><span data-contrast="auto">, which provides a set of metrics designed for both retrieval and generation tasks and allows you to create your own rating criteria. </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-b4e1ef9 elementor-widget elementor-widget-text-editor" data-id="b4e1ef9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Key evaluation metrics are:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">G-Eval</span></b><span data-contrast="auto">: A versatile metric that evaluates LLM output based on custom-defined criteria.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Answer Relevancy</span></b><span data-contrast="auto">: Measures how well the model’s response addresses the user query.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><b><span data-contrast="auto">Faithfulness</span></b><span data-contrast="auto">: Assesses how accurately the response aligns with the provided context, helping to limit hallucination in RAG systems.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="4" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><b><span data-contrast="auto">ContextualRecallMetric, ContextualPrecisionMetric, ContextualRelevancyMetric</span></b><span data-contrast="auto">: These metrics are particularly useful for RAG systems, evaluating whether retrieval components return all relevant context while avoiding irrelevant information.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul>						</div>
				</div>
				<div class="elementor-element elementor-element-1c250db elementor-widget elementor-widget-heading" data-id="1c250db" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h3 class="elementor-heading-title elementor-size-default">Step-by-Step RAG Model Testing with DeepEval  </h3>		</div>
				</div>
				<div class="elementor-element elementor-element-e3db2c9 elementor-widget elementor-widget-text-editor" data-id="e3db2c9" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TrackedChange SCXW136457389 BCX0"><span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">To ensure the reliability and accuracy of our Retrieval-Augmented Generation (RAG) model, we follow a structured evaluation approach. </span></span></span><span class="TrackedChange SCXW136457389 BCX0"><span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">This process involves dataset creation, response generation, and model evaluation using </span></span></span><span class="TrackedChange SCXW136457389 BCX0"><span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">DeepEval</span></span></span><span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">, allowing us to systematically assess the effectiveness of both retrieval and generation components.</span></span> <span class="TextRun SCXW136457389 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW136457389 BCX0">Let’s</span><span class="NormalTextRun SCXW136457389 BCX0"> break down each step in detail.</span></span><span class="EOP SCXW136457389 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-019ed5f elementor-widget elementor-widget-heading" data-id="019ed5f" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">1. Dataset Creation </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-741ed12 elementor-widget elementor-widget-text-editor" data-id="741ed12" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">To evaluate performance, we create a test set consisting of:</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW40841927 BCX0"><span class="SCXW40841927 BCX0"> </span><br class="SCXW40841927 BCX0" /></span><span class="LineBreakBlob BlobObject DragDrop SCXW40841927 BCX0"><span class="SCXW40841927 BCX0"> </span><br class="SCXW40841927 BCX0" /></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">&#8211; </span></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">Realistic questions</span></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0"> that users might ask. These can range from simple fact-based queries to more complex, multi-step inquiries that require detailed answers drawn from multiple tables.</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW40841927 BCX0"><span class="SCXW40841927 BCX0"> </span><br class="SCXW40841927 BCX0" /></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">&#8211; </span></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0">Expected ground truth responses</span></span><span class="TextRun SCXW40841927 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW40841927 BCX0"> derived directly from the database.</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW40841927 BCX0"><span class="SCXW40841927 BCX0"> </span><br class="SCXW40841927 BCX0" /></span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f4f0b50 elementor-widget elementor-widget-heading" data-id="f4f0b50" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">2. Generating Model Responses </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-587939b elementor-widget elementor-widget-text-editor" data-id="587939b" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW56801091 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW56801091 BCX0">For each test query, the assistant generates an answer based on the relevant data retrieved from the database.</span></span><span class="EOP SCXW56801091 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-0deb934 elementor-widget elementor-widget-heading" data-id="0deb934" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">3. Evaluation using DeepEval </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-7c0b685 elementor-widget elementor-widget-text-editor" data-id="7c0b685" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">We are particularly focused on </span><b><span data-contrast="auto">factual correctness</span></b><span data-contrast="auto"> for our assistant, so we use the </span><b><span data-contrast="auto">G-Eval metric</span></b><span data-contrast="auto"> to evaluate this aspect.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">We need to define G-Eval by describing testing criteria, e.g.:</span><span data-ccp-props="{}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-3d98b09 elementor-widget elementor-widget-text-editor" data-id="3d98b09" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-contrast="auto">correctness_metric = GEval(    </span> <br /><span data-contrast="auto">    name="Correctness",     </span> <br /><span data-contrast="auto">    evaluation_steps=[  </span> <br /><span data-contrast="auto">        "Assess whether the actual output is accurate in terms of facts compared to the expected output.",      </span> <br /><span data-contrast="auto">        "Penalize missing information."  </span> <br /><span data-contrast="auto">    ],      </span> <br /><span data-contrast="auto">    evaluation_params=[  </span> <br /><span data-contrast="auto">       LLMTestCaseParams.INPUT,   </span> <br /><span data-contrast="auto">       LLMTestCaseParams.ACTUAL_OUTPUT,   </span> <br /><span data-contrast="auto">       LLMTestCaseParams.EXPECTED_OUTPUT  </span> <br /><span data-contrast="auto">    ],    </span> <br /><span data-contrast="auto">)</span> </pre>						</div>
				</div>
				<div class="elementor-element elementor-element-63d0764 elementor-widget elementor-widget-text-editor" data-id="63d0764" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW196212698 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW196212698 BCX0">Additionally, we use several built-in metrics:</span></span><span class="EOP SCXW196212698 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-4344a23 elementor-widget elementor-widget-text-editor" data-id="4344a23" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">contextual_precision</span><span class="NormalTextRun SCXW8241585 BCX0"> = </span><span class="NormalTextRun SCXW8241585 BCX0">ContextualPrecisionMetric</span><span class="NormalTextRun SCXW8241585 BCX0">()</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">contextual_recall = </span><span class="NormalTextRun SCXW8241585 BCX0">ContextualRecallMetric</span><span class="NormalTextRun SCXW8241585 BCX0">()</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">contextual_relevancy = </span><span class="NormalTextRun SCXW8241585 BCX0">ContextualRelevancyMetric</span><span class="NormalTextRun SCXW8241585 BCX0">()</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">answer_relevancy = </span><span class="NormalTextRun SCXW8241585 BCX0">AnswerRelevancyMetric</span><span class="NormalTextRun SCXW8241585 BCX0">()</span></span><span class="LineBreakBlob BlobObject DragDrop SCXW8241585 BCX0"><span class="SCXW8241585 BCX0"> </span><br class="SCXW8241585 BCX0" /></span><span class="TextRun SCXW8241585 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW8241585 BCX0">faithfulness = </span><span class="NormalTextRun SCXW8241585 BCX0">FaithfulnessMetric</span><span class="NormalTextRun SCXW8241585 BCX0">()</span></span><span class="EOP SCXW8241585 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-5d2ddcf elementor-widget elementor-widget-text-editor" data-id="5d2ddcf" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW77075170 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW77075170 BCX0">We then define test cases:</span></span><span class="EOP SCXW77075170 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-f16e61d elementor-widget elementor-widget-text-editor" data-id="f16e61d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-contrast="auto">test_case = LLMTestCase(  </span> <br /><span data-contrast="auto">    input=#user prompt,  </span> <br /><span data-contrast="auto">    actual_output=#model output here,  </span> <br /><span data-contrast="auto">    expected_output=#the ground truth response </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span><br /><br /><span data-contrast="auto">    retrieval_context=#data extracted by retriever, in our case it is data extracted from the database</span> <br /><span data-contrast="auto">)</span> <br /><span data-ccp-props="{}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-466c61d elementor-widget elementor-widget-text-editor" data-id="466c61d" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW131448305 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW131448305 BCX0">Here is </span><span class="NormalTextRun SCXW131448305 BCX0">one of</span><span class="NormalTextRun SCXW131448305 BCX0"> test case</span><span class="NormalTextRun SCXW131448305 BCX0">s</span><span class="NormalTextRun SCXW131448305 BCX0"> we used to </span><span class="NormalTextRun SCXW131448305 BCX0">evaluate our </span><span class="NormalTextRun SCXW131448305 BCX0">SeaStat</span> <span class="NormalTextRun SCXW131448305 BCX0">Assitant</span><span class="NormalTextRun SCXW131448305 BCX0">:</span></span><span class="EOP SCXW131448305 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-53c56bc elementor-widget elementor-widget-text-editor" data-id="53c56bc" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-contrast="auto">test_case = LLMTestCase(  </span> <br /><span data-contrast="auto">    input='Compare cargo traffic in Suez Canal and Panama Canal in 2019',  </span> <br /><span data-contrast="auto">    actual_output= 'In 2019, the cargo traffic data for the Suez Canal and Panama Canal was as follows: Suez Canal - 1031 million tons; Panama Canal - 243059 thousand tons. The Suez Canal had significantly higher cargo traffic compared to the Panama Canal in 2019.' </span> <br /><span data-contrast="auto">    expected_output=' In 2019, the Suez Canal handled 1,031 million tons of cargo, whereas the Panama Canal transported only 243 million tons. This indicates that the Suez Canal carried a substantially higher volume of cargo than the Panama Canal that year.' </span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span><br /><br /><span data-contrast="auto">    retrieval_context=[</span><span data-ccp-props="{}"> </span><br /><br /><span data-contrast="auto">{'table': 'Suez_Canal_Cargo_Traffic', 'year': 2019, 'cargo_volume_million_tons': 1031},</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span><br /><br /><span data-contrast="auto">{'table': 'Panama_Canal_Cargo_Traffic', 'year': 2019, 'direction': 'Atlantic – Pacific', 'cargo_volume_thousand_tons': 156899}, {'table': 'Panama_Canal_Cargo_Traffic', 'year': 2019, 'direction': 'Pacific – Atlantic', 'cargo_volume_thousand_tons': 86160}</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span><br /><br /><span data-contrast="auto">]</span> <br /><span data-contrast="auto">)</span> </pre>						</div>
				</div>
				<div class="elementor-element elementor-element-2c1ba07 elementor-widget elementor-widget-text-editor" data-id="2c1ba07" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span class="TextRun SCXW81219040 BCX0" lang="EN-GB" xml:lang="EN-GB" data-contrast="auto"><span class="NormalTextRun SCXW81219040 BCX0">And run evaluation:</span></span><span class="EOP SCXW81219040 BCX0" data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559731&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-fb89c70 elementor-widget elementor-widget-text-editor" data-id="fb89c70" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<pre><span data-contrast="auto">assert_test(test_case, [correctness_metric, answer_relevancy, contextual_precision, contextual_recall, contextual_relevancy, faithfulness])</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:1,&quot;335551620&quot;:1,&quot;335559685&quot;:0,&quot;335559731&quot;:0,&quot;335559737&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:279}"> </span></pre>						</div>
				</div>
				<div class="elementor-element elementor-element-9893049 elementor-widget elementor-widget-heading" data-id="9893049" data-element_type="widget" data-widget_type="heading.default">
				<div class="elementor-widget-container">
			<h4 class="elementor-heading-title elementor-size-default">4. Testing results </h4>		</div>
				</div>
				<div class="elementor-element elementor-element-283d669 elementor-widget elementor-widget-text-editor" data-id="283d669" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">DeepEval assigns each metric a score between 0 and 1, accompanied by a descriptive explanation of the rating. Below are the results from a test case evaluating SeaStat&#8217;s response to the prompt:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><b><span data-contrast="auto">&#8220;Compare cargo traffic in the Suez Canal and Panama Canal in 2019.&#8221;</span></b><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">Metric interpretations:</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><ul><li data-leveltext="" data-font="Symbol" data-listid="8" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">Contextual Recall</span></b> <b><span data-contrast="auto">(1.0)</span></b><span data-contrast="auto"> &#8211; The retriever effectively retrieved the necessary information, meaning that almost all essential details from the expected output were present in the retrieval context.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="8" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Contextual Relevancy (0.95)</span></b><span data-contrast="auto"> and </span><b><span data-contrast="auto">Contextual Precision (1.0)</span></b><span data-contrast="auto"> &#8211; The retrieved context was highly relevant to the query, showing that the retriever pulled information accurately related to the input.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><b><span data-contrast="auto">Faithfulness</span></b> <b><span data-contrast="auto">(1.0)</span></b><span data-contrast="auto"> &#8211; The model’s response remained perfectly factual, strictly adhering to the retrieved information without introducing any hallucinations.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><b><span data-contrast="auto">Answer Relevancy</span></b> <b><span data-contrast="auto">(1.0)</span></b><span data-contrast="auto"> – The model&#8217;s response fully addressed the user query, ensuring that the answer was on point.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><ul><li data-leveltext="" data-font="Symbol" data-listid="9" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559683&quot;:0,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><b><span data-contrast="auto">Correctness</span></b><span data-contrast="auto">, </span><b><span data-contrast="auto">(0.78)</span></b><span data-contrast="auto"> – the correctness score was slightly lower due to numerical discrepancies caused by rounding.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0}"> </span></li></ul><p><span data-contrast="auto">By systematically analyzing test cases with DeepEval, we gain valuable insights into where our RAG model excels and where improvements are needed. Future optimizations could include refining retrieval strategies, adjusting prompt engineering, or fine-tuning LLM parameters for better factual accuracy.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-0445df6 elementor-widget elementor-widget-text-editor" data-id="0445df6" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<table style="font-weight: 400;" data-tablestyle="MsoTableGrid" data-tablelook="1696" aria-rowcount="7"><tbody><tr aria-rowindex="1"><td data-celllook="0"><p><b><span data-contrast="auto">Test case</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">Metric</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">Score</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">Status</span></b><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><b><span data-contrast="auto">Overall Success Rate</span></b><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="2"><td colspan="1" rowspan="6" data-celllook="0"><p><span data-contrast="auto">test_case_0</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">Correctness (GEval)</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">0.78 (threshold=0.5, evaluation model=gpt-4o, reason=The actual output closely matches the expected output in terms of cargo volumes and comparative conclusion, but the numbers are expressed in different units (thousand tons vs million tons) and slightly differ, which may indicate rounding or conversion discrepancies., error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td><td colspan="1" rowspan="6" data-celllook="0"><p><span data-contrast="auto">100%</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="3"><td data-celllook="0"><p><span data-contrast="auto">Answer Relevancy</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">1.0 (threshold=0.5, evaluation model=gpt-4o, reason=The score is 1.00 because the response thoroughly addressed the comparison of cargo traffic in the Suez Canal and the Panama Canal in 2019 with no irrelevant details included. It&#8217;s precise and to the point, showcasing a deep understanding of the topic., error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="4"><td data-celllook="0"><p><span data-contrast="auto">Contextual Precision</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">1.0 (threshold=0.5, evaluation model=gpt-4o, reason=The score is 1.00 because the relevant nodes, offering essential data for comparing cargo traffic in the Suez and Panama Canals in 2019, are perfectly ranked at the top. These nodes effectively deliver a comprehensive breakdown of cargo volumes through both canals during that year, ensuring accurate comparisons can be made efficiently., error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="5"><td data-celllook="0"><p><span data-contrast="auto">Contextual Recall</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">1.0 (threshold=0.5, evaluation model=gpt-4o, reason=The score is 1.00 because every sentence in the expected output aligns perfectly with the data from the nodes in the retrieval context, effectively illustrating the significant difference in cargo volumes handled by both canals. Well done on maintaining precise and accurate attention to detail!, error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="6"><td data-celllook="0"><p><span data-contrast="auto">Contextual Relevancy</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">0.95 (threshold=0.5, evaluation model=gpt-4o, reason=The score is 0.95 because although the context is rich with detailed data on Suez Canal cargo traffic, it lacks specific information on the Panama Canal&#8217;s cargo traffic, necessitating additional data for a complete comparison., error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr><tr aria-rowindex="7"><td data-celllook="0"><p><span data-contrast="auto">Faithfulness</span><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">1.0 (threshold=0.5, evaluation model=gpt-4o, reason=Awesome job! The score is 1.00 because there are no contradictions present, showcasing perfect alignment and faithfulness of the actual output to the retrieval context. Keep up the excellent work!, error=None)</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:0,&quot;335551620&quot;:0,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-ccp-props="{}"> </span></p></td><td data-celllook="0"><p><span data-contrast="auto">PASSED</span><span data-ccp-props="{}"> </span></p></td></tr></tbody></table>						</div>
				</div>
				<div class="elementor-element elementor-element-abdf550 elementor-widget elementor-widget-text-editor" data-id="abdf550" data-element_type="widget" data-widget_type="text-editor.default">
				<div class="elementor-widget-container">
							<p><span data-contrast="auto">Evaluating Retrieval-Augmented Generation (RAG) models requires a structured approach to ensure both retrieval accuracy and response reliability. </span><span data-contrast="auto">LLM-as-a-Judge</span> <span data-contrast="auto">provides</span><span data-contrast="auto"> an efficient alternative to human evaluation by systematically assessing outputs based on predefined criteria, enabling scalable and cost-effective validation.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">Using </span><span data-contrast="auto">DeepEval</span><span data-contrast="auto">, we tested our AI-driven </span><span data-contrast="auto">SeaStat</span><span data-contrast="auto"> Assistant</span><span data-contrast="auto"> against key evaluation metrics, including </span><span data-contrast="auto">Correctness (G-Eval), Answer Relevancy, Contextual Precision, Contextual Recall, Contextual Relevancy, and Faithfulness</span><span data-contrast="auto">. The results highlighted </span><span data-contrast="auto">minor discrepancies in numerical representation, missing contextual details, and retrieval precision—insights crucial f</span><span data-contrast="auto">o</span><span data-contrast="auto">r refining model performance.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">These findings emphasize that </span><span data-contrast="auto">even high-performing RAG models require rigorous evaluation to ensure factual accuracy and prevent misleading outputs</span><span data-contrast="auto">. By automating this process, we enable continuous model improvement, ensuring </span><span data-contrast="auto">AI-driven assistants deliver reliable, context-aware insights at scale</span><span data-contrast="auto">.</span> <span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p><p><span data-contrast="auto">AI-powered assistants are undoubtedly a technology that will become an indispensable tool for employees at all levels—from executives and directors to specialists. Their dynamic development allows them to instantly adapt to business needs and evolving expectations.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559738&quot;:240,&quot;335559739&quot;:240}"> </span></p>						</div>
				</div>
				<div class="elementor-element elementor-element-308ac2d elementor-cta--skin-cover elementor-animated-content elementor-bg-transform elementor-bg-transform-zoom-in elementor-widget elementor-widget-call-to-action" data-id="308ac2d" data-element_type="widget" data-widget_type="call-to-action.default">
				<div class="elementor-widget-container">
					<div class="elementor-cta">
					<div class="elementor-cta__bg-wrapper">
				<div class="elementor-cta__bg elementor-bg" style="background-image: url(https://inero-software.com/wp-content/uploads/2024/12/3-1030x1030.png);" role="img" aria-label="3"></div>
				<div class="elementor-cta__bg-overlay"></div>
			</div>
							<div class="elementor-cta__content">
				
									<h2 class="elementor-cta__title elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						We create reliable AI assistants					</h2>
				
									<div class="elementor-cta__description elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
						If you're looking for a company to help you implement an AI-based solution, reach out to us. We’d be happy to discuss your idea.					</div>
				
									<div class="elementor-cta__button-wrapper elementor-cta__content-item elementor-content-item elementor-animated-item--grow">
					<a class="elementor-cta__button elementor-button elementor-size-" href="https://inero-software.com/contact-us/">
						Contact Us					</a>
					</div>
							</div>
						</div>
				</div>
				</div>
				</div>
				</div>
		<div class="elementor-element elementor-element-961021e e-con-full e-flex e-con e-child" data-id="961021e" data-element_type="container">
				</div>
					</div>
				</div>
				</div>
		<p>Artykuł <a href="https://inero-software.com/assessing-retrieval-augmented-generation-rag-large-language-models-llms-with-deepeval-for-complex-tabular-data/">Assessing Retrieval-Augmented Generation (RAG) Large Language Models (LLMs) with DeepEval for Complex Tabular Data</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">6902</post-id>	</item>
		<item>
		<title>A year under the sign of artificial intelligence development</title>
		<link>https://inero-software.com/ai-year-summary/</link>
		
		<dc:creator><![CDATA[Marta Kuprasz]]></dc:creator>
		<pubDate>Mon, 18 Dec 2023 10:32:30 +0000</pubDate>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[Company]]></category>
		<category><![CDATA[AI]]></category>
		<category><![CDATA[AI development]]></category>
		<category><![CDATA[AI innovations]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[Copilot]]></category>
		<category><![CDATA[Gemini]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[Micosoft]]></category>
		<category><![CDATA[Natural Language Processing]]></category>
		<category><![CDATA[OpenAI]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=5324</guid>

					<description><![CDATA[<p>The end of the year is a time for summaries. In the world of IT, many interesting things have happened, so in this article, we decided to focus on AI. The development of artificial intelligence and its media presence accelerated to an unprecedented scale. Tools based on Large Language Models&#8230;</p>
<p>Artykuł <a href="https://inero-software.com/ai-year-summary/">A year under the sign of artificial intelligence development</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[<h3></h3>
<p><span data-contrast="auto">The end of the year is a time for summaries. In the world of IT, many interesting things have happened, so in this article, we decided to focus on AI. The development of artificial intelligence and its media presence accelerated to an unprecedented scale. Tools based on Large Language Models (LLMs) have been popularized and made widely available to users from various industries, not just technological ones. We decided to summarize the year with Andrzej Chybicki, the CEO of Inero Software. Here is the list he identified as the key 5 events of the past year.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<h3><span data-contrast="auto">Fact 1: OpenAI &#8211; artificial intelligence becomes widely accessible</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h3>
<p><span data-contrast="auto">OpenAI played a tremendous role in popularizing the field of artificial intelligence in the context of human language understanding. In 2022, they released ChatGPT, and in the following months, they presented new, improved models. These advancements not only improved the performance of existing applications but also opened new avenues for AI in healthcare, environmental science, administration, marketing, and more. </span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">In 2023, ChatGPT saw remarkable advancements, featuring enhanced learning algorithms for improved accuracy and nuanced conversations, personalized user interactions, expanded language support for global accessibility, and broader application integration. OpenAI emphasized ethical considerations and bias reduction, incorporated real-time learning for up-to-date content, improved multimedia interaction capabilities, and boosted the tool&#8217;s robustness and reliability. Additionally, ChatGPT was tailored for specific industries, providing specialized functionalities and knowledge, marking a significant leap in AI technology and user-centric applications.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<h5><b><span data-contrast="auto">Expert Insight</span></b><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h5>
<p><span data-contrast="auto">OpenAI was the first widely recognized large language model. In the coming years, we are likely to see various versions of LLMs designed for specific applications &#8211; in fact, this has been happening for a few months now. OpenAI, despite being a pioneer, at least in terms of recognizability, is not always considered the best model for everything. The direction of development is certainly popularization in a similar way as it was with computers (i.e., LLMs like PCs) and specialization, meaning specialized language models designed for specific applications or even entities or people. </span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto"> </span></p>
<h3><span data-contrast="auto">Fact 2: GitHub Copilot &#8211; </span><span data-contrast="auto">a leader in AI/LLM implementation</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h3>
<p><span data-contrast="auto">One of the key roles in the development of artificial intelligence is played by Microsoft, which collaborates with OpenAI. Over the past year, Microsoft has continued to refine its vision of Microsoft Copilot. Let&#8217;s focus on the solution for developers: GitHub Copilot. In 2023 it underwent significant changes and enhancements. Here are the key updates:</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">In 2023, GitHub Copilot introduced several significant enhancements to bolster its role in AI-driven software development. The GitHub Copilot Chat, now generally available and powered by OpenAI&#8217;s GPT-4, provides more accurate code suggestions and explanations, using natural language to aid developers in various languages. This feature is integrated with both the GitHub platform and its mobile app, supporting coding, pull requests, and documentation. Additionally, GitHub Copilot Enterprise was introduced to tailor the tool to specific organizational needs, helping developers quickly adapt to their organization’s codebase and streamline tasks like documentation and pull request reviews, aimed at boosting enterprise-level productivity and security. The GitHub Copilot Partner Program was launched, integrating Copilot with various third-party developer tools and services, thereby creating a broad ecosystem that enhances the capabilities of developers using AI. Finally, GitHub unveiled new AI-powered security features in its Advanced Security suite, including a real-time vulnerability prevention system and application security testing features to detect and remediate code vulnerabilities and secrets, further securing the software development process.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><b><span data-contrast="none"> </span></b><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<h5><b><span data-contrast="auto">Expert Insight</span></b><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h5>
<p><span data-contrast="auto">Thanks to its collaboration with OpenAI, Microsoft became a leader in AI/LLM implementation worldwide in 2023. Microsoft&#8217;s strategy in this area is based on using the LLM model to support (but not replace) as many activities and processes using Microsoft products as possible. Particularly important was ensuring an appropriate level of SLA (aligned with other Azure services) and data security. Among the most significant changes, apart from the mentioned GitHub Copilot (which aims to support developers in coding), are Copilot plugins available in practically all of this company&#8217;s flagship products (Word, Excel, PowerPoint, Outlook).</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">In December 2023, Microsoft also presented the CoPilot Studio solution, which enables the creation of low-code/no-code IT systems with significant support from the OpenAI model. This effectively allows for the easy expansion of existing Azure low-code solutions such as Azure Agents with conversational bots or AI-supported database adapters. Although CoPilot Studio is not yet available in its final form, Microsoft clearly communicates development directions and the advantages that developers, engineers, and users can experience from its use. From the presentations of Microsoft representatives, it can be inferred that Microsoft&#8217;s goal is to lower the entry threshold for creating and implementing new advanced AI solutions, as using low-code platforms does not require as deep technical knowledge as traditional coding. We can expect widespread interest in these solutions not only from the largest companies using MS Azure in the coming years. Currently, among experts, the question is not “whether to use AI” but how to implement it to not fall behind the competition. Those entities that create a coherent strategy for incorporating AI-based products into their processes in the coming years will be able to significantly benefit from the revolution that is already taking place.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto"> </span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<h3><span data-contrast="auto">Fact 3: The European AI Act: A Regulatory Milestone</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h3>
<p><span data-contrast="auto">On 14 June 2023, the European Parliament adopted its negotiating position on the AI Act. Parliament’s priority is to make sure that AI systems used in the EU are safe, transparent, traceable, non-discriminatory and environmentally friendly. Parliament also wants to establish a technology-neutral, uniform definition for AI that could be applied to future AI systems. The AI Act sets different rules for different AI risk levels.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">The new rules establish obligations for providers and users depending on the level of risk from artificial intelligence. While many AI systems pose minimal risk, they need to be assessed.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><b><span data-contrast="auto">Unacceptable risk</span></b><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">Unacceptable risk AI systems are systems considered a threat to people and will be banned. They include:</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<ul>
<li data-leveltext="·" data-font="Symbol" data-listid="7" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;·&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="1" data-aria-level="1"><span data-contrast="auto">Cognitive behavioral manipulation of people or specific vulnerable groups: for example voice-activated toys that encourage dangerous behavior in children</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
<li data-leveltext="·" data-font="Symbol" data-listid="7" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;·&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="2" data-aria-level="1"><span data-contrast="auto">Social scoring: classifying people based on behavior, socioeconomic status or personal characteristics</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
<li data-leveltext="·" data-font="Symbol" data-listid="7" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;·&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="3" data-aria-level="1"><span data-contrast="auto">Real-time and remote biometric identification systems, such as facial recognition</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
</ul>
<p><span data-contrast="auto">Some exceptions may be allowed: For instance, “post” remote biometric identification systems where identification occurs after a significant delay will be allowed to prosecute serious crimes but only after court approval.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><b><span data-contrast="auto">High risk</span></b><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">AI systems that negatively affect safety or fundamental rights will be considered high-risk and will be divided into two categories:</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">1) AI systems that are used in products falling under the EU’s product safety legislation. This includes toys, aviation, cars, medical devices and lifts.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">2) AI systems falling into eight specific areas that will have to be registered in an EU database:</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<ul>
<li data-leveltext="·" data-font="Symbol" data-listid="10" data-list-defn-props="{&quot;335552541&quot;:1,&quot;335559684&quot;:-2,&quot;335559685&quot;:720,&quot;335559991&quot;:360,&quot;469769226&quot;:&quot;Symbol&quot;,&quot;469769242&quot;:[8226],&quot;469777803&quot;:&quot;left&quot;,&quot;469777804&quot;:&quot;·&quot;,&quot;469777815&quot;:&quot;hybridMultilevel&quot;}" aria-setsize="-1" data-aria-posinset="4" data-aria-level="1"><span data-contrast="auto">Biometric identification and categorisation of natural persons</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
<li><span data-contrast="auto">Management and operation of critical infrastructure</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
<li><span data-contrast="auto">Education and vocational training</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
<li><span data-contrast="auto">Employment, worker management and access to self-employment</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
<li><span data-contrast="auto">Access to and enjoyment of essential private services and public services and benefits</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
<li><span data-contrast="auto">Law enforcement</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
<li><span data-contrast="auto">Migration, asylum and border control management</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
<li><span data-contrast="auto">Assistance in legal interpretation and application of the law.</span><span data-ccp-props="{&quot;134233117&quot;:false,&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559738&quot;:0,&quot;335559739&quot;:0,&quot;335559740&quot;:259}"> </span></li>
</ul>
<p><span data-contrast="auto">All high-risk AI systems will be assessed before being put on the market and also throughout their lifecycle. </span><a href="https://www.europarl.europa.eu/news/en/headlines/society/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence"><span data-contrast="none">For more information, visit the European Parliament website.</span></a><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">*source: </span><a href="https://www.europarl.europa.eu/"><span data-contrast="none">https://www.europarl.europa.eu</span></a><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<h5><b><span data-contrast="auto">Expert Insight</span></b><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h5>
<p><span data-contrast="auto">Ensuring security and confidentiality of data is certainly one of the most important issues concerning the implementation of AI solutions. Many experts indicate that despite the good intentions of the European Commission, the proposed solutions may contribute to reducing the competitiveness of the domestic AI market, which in effect will increase the distance between Europe and leaders in this field (i.e., the USA and China). I personally share these concerns. Here, a good example might be the similar situation that occurred about 15 years ago when cloud computing was being implemented. At that time, the EU also created a regulation governing the rules of access and data confidentiality (GDPR), which to this day is the regulatory basis in this area. At the same time, the largest solutions that most in the EU use are those developed in the USA, where the priority was the free development of technology, and only secondarily the legal framework. Unfortunately, many indications suggest that a similar situation might occur with AI.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p>&nbsp;</p>
<h3><span data-contrast="auto">Fact 4: Gemini: new model from Google</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h3>
<p><span data-contrast="auto">Without a doubt, the launch of Gemini was the most prominent premiere in the latter part of 2023, generating significant buzz. It is a result of large-scale collaborative efforts by teams across Google. It was built from the ground up to be multimodal, which means it can generalize and seamlessly understand, operate across, and combine different types of information including text, code, audio, image, and video.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">Gemini 1.0 was trained to recognize and understand text, images, audio, and more at the same time, so it better understands nuanced information and can answer questions relating to complicated topics. This makes it especially good at explaining reasoning in complex subjects like math and physics.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">During the presentation on the release of the Gemini API for developers, a lot of time was dedicated to AI Studio, a browser-based, free tool for code creation. The second focus was on Vertex AI, a more advanced program that allows for &#8220;both training and deploying ML (machine learning) models and AI applications.&#8221; Google offers the option to transfer a preliminary project developed in AI Studio to Vertex AI, to add additional features available within the larger platform of Google Cloud.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<h5><b><span data-contrast="auto">Expert Insight</span></b><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h5>
<p><span data-contrast="auto">Google has officially joined the large language model (LLM) race. The most intriguing aspect of what they propose is that their model will operate in three versions: Ultra (the most feature-rich), Pro, and Nano, with the latter being designed for mobile phones. It&#8217;s still unclear whether Nano will run entirely on client devices (smartphones) or if it will simply be a thin client and a kind of extension of Google Assistant. It&#8217;s also worth emphasizing that Google, like Microsoft, will offer Gemini services as elements of its flagship products, such as Google Sheets, Google Docs, and others.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto"> </span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<h3><span data-contrast="auto">Fact 5: Advancements in Natural Language Processing (NLP)</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h3>
<p><span data-contrast="auto">2023 witnessed remarkable progress in the field of Natural Language Processing. Researchers and companies globally made significant strides in improving the accuracy and versatility of NLP models. These advancements have led to more sophisticated understanding and the generation of human language by machines, paving the way for more intuitive and natural human-computer interactions. This year saw the deployment of advanced NLP in various applications, from customer service chatbots to complex data analysis tools, revolutionizing how we interact with technology daily. This progress in NLP technology not only enhanced existing applications but also opened new possibilities for AI in fields such as education, content creation, and multilingual communication.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<h5><b><span data-contrast="auto">Expert Insight</span></b><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></h5>
<p><span data-contrast="auto">AI technologies are increasingly breaking the barrier of understanding natural language, gradually blurring the line between structured data previously used in IT systems and human knowledge. It seems that the creation of AGI (Artificial General Intelligence), a machine matching or even surpassing the average human in many aspects, is now just a matter of time. The challenge for the world of science, business, and politics will now be to direct the development of AI in a way that serves the broadly understood humanity and does not cause threats that many (probably rightly) fear.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><span data-contrast="auto">The last 12 months have been rich in interesting AI releases. The presentation of new large language models has opened up a range of possibilities for their implementation in everyday tasks, both in programming work and creative teams. European authorities are trying to keep up with these changes and adapt legal regulations to be in line with the current technological situation. In the coming months, we will certainly see more premieres, as leading players like Google and Microsoft compete to create solutions that utilize artificial intelligence.</span><span data-ccp-props="{&quot;134233118&quot;:false,&quot;201341983&quot;:0,&quot;335551550&quot;:6,&quot;335551620&quot;:6,&quot;335559739&quot;:160,&quot;335559740&quot;:257}"> </span></p>
<p><a href="https://inero-software.com/contact-us/"><img loading="lazy" decoding="async" data-attachment-id="5331" data-permalink="https://inero-software.com/ai-year-summary/banner-18-12-2/" data-orig-file="https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2.png" data-orig-size="2250,375" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="banner 18.12. (2)" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2-300x50.png" data-large-file="https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2-1030x172.png" tabindex="0" role="button" class="wp-image-5331 aligncenter" src="https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2-300x50.png" alt="" width="1058" height="176" srcset="https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2-300x50.png 300w, https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2-1030x172.png 1030w, https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2-768x128.png 768w, https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2-1536x256.png 1536w, https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2-2048x341.png 2048w, https://inero-software.com/wp-content/uploads/2023/12/banner-18.12.-2-1520x253.png 1520w" sizes="(max-width: 1058px) 100vw, 1058px" /></a></p>
<p>&nbsp;</p>
<p><span data-ccp-props="{&quot;201341983&quot;:0,&quot;335559739&quot;:160,&quot;335559740&quot;:259}"> </span></p>
<p>Artykuł <a href="https://inero-software.com/ai-year-summary/">A year under the sign of artificial intelligence development</a> pochodzi z serwisu <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">5324</post-id>	</item>
	</channel>
</rss>
