<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	
	xmlns:georss="http://www.georss.org/georss"
	xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
	>

<channel>
	<title>data extraction - Inero Software - Software Consulting</title>
	<atom:link href="https://inero-software.com/tag/data-extraction/feed/" rel="self" type="application/rss+xml" />
	<link>https://inero-software.com/tag/data-extraction/</link>
	<description>We unleash innovations using cutting-edge technologies, modern design and AI</description>
	<lastBuildDate>Mon, 27 Feb 2023 09:56:08 +0000</lastBuildDate>
	<language>en-GB</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.1</generator>

<image>
	<url>https://inero-software.com/wp-content/uploads/2018/11/inero-logo-favicon.png</url>
	<title>data extraction - Inero Software - Software Consulting</title>
	<link>https://inero-software.com/tag/data-extraction/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">153509928</site>	<item>
		<title>Document Data Processing using GPT model</title>
		<link>https://inero-software.com/document-data-processing-using-gpt-model/</link>
		
		<dc:creator><![CDATA[Marceli Formela]]></dc:creator>
		<pubDate>Thu, 09 Feb 2023 14:49:42 +0000</pubDate>
				<category><![CDATA[Company]]></category>
		<category><![CDATA[ChatGPT]]></category>
		<category><![CDATA[data extraction]]></category>
		<category><![CDATA[DataDrivenDecisionMaking]]></category>
		<category><![CDATA[GPT]]></category>
		<category><![CDATA[Large Language Model]]></category>
		<category><![CDATA[LLM]]></category>
		<guid isPermaLink="false">https://inero-software.com/?p=4993</guid>

					<description><![CDATA[<p>GPT &#8211; A DIFFERENT POINT OF VIEW In the previous post, we looked at the possibilities of the GPT model in terms of supporting the work of a programmer. But that&#8217;s not the only thing this technology can do. GPT is a language model that utilizes a deep learning technique&#8230;</p>
<p>The article <a href="https://inero-software.com/document-data-processing-using-gpt-model/">Document Data Processing using GPT model</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h2 style="text-align: center;"><strong><span style="color: #800080;">GPT &#8211; A DIFFERENT POINT OF VIEW</span><br />
</strong></h2>
<p><img decoding="async" data-attachment-id="2338" data-permalink="https://inero-software.com/document-data-processing-using-gpt-model/inero-glify-11-3/" data-orig-file="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1.png" data-orig-size="417,208" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="" data-image-description="&lt;p&gt;Inero Software&lt;/p&gt;
" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1-300x150.png" data-large-file="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1.png" tabindex="0" role="button" class="aligncenter wp-image-2338" src="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1-300x150.png" alt="Separating icon" width="92" height="46" srcset="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1-300x150.png 300w, https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1.png 417w" sizes="(max-width: 92px) 100vw, 92px" /></p>
<p><img decoding="async" data-attachment-id="2768" data-permalink="https://inero-software.com/data-the-playground-of-machine-learning/inero-glify-08/" data-orig-file="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png" data-orig-size="208,208" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Inero Glyph v8" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png" data-large-file="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png" tabindex="0" role="button" class="wp-image-2768 alignleft" src="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png" alt="Paragraph icon" width="77" height="77" srcset="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png 208w, https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08-80x80.png 80w, https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08-50x50.png 50w" sizes="(max-width: 77px) 100vw, 77px" />In the <a href="https://inero-software.com/chatgpt-a-developers-point-of-view/">previous post</a>, we looked at how the <a href="https://platform.openai.com/overview">GPT</a> model can support a programmer&#8217;s work. But that&#8217;s not the only thing this technology can do. GPT is a language model built on a deep learning architecture known as the transformer, which, among other things, allows it to produce text that mimics human writing.
It has been trained on an extensive collection of text found on the internet and can be adapted to specific tasks such as language translation, text summarization and text generation. The model is multilingual and can respond to a broad spectrum of questions, commands and prompts. It is also capable of generating creative content such as stories and poetry, and it can be applied in fields such as virtual assistants, chatbots and automated writing. However, it&#8217;s important to keep in mind that, advanced as it is, GPT is not human and may not fully understand context or the nuances of language, so it may make errors or reproduce biases present in the training data.</p>
<p>We asked <a href="https://platform.openai.com/overview" target="_blank" rel="noopener">ChatGPT</a> the following question: &#8220;Describe your ability to summarize texts.&#8221;</p>
<p>Output: &#8220;As a language model, I am able to summarize documents by identifying and extracting the main points and condensing them into a shorter version while retaining the key information. I can summarize documents of various formats such as text, PDFs, and articles. However, my ability to summarize a document may depend on the length and complexity of the document, as well as the quality of my fine-tuning. Additionally, my summary may not always be entirely accurate, as I am not capable of understanding the context or nuances in the same way a human would, but I can provide a summary which is more readable, informative and precise based on the data I have been trained on.&#8221;</p>
<p>Based on this information, we will run an experiment to test how well the text-davinci-003 model analyzes documents that we prepared specifically for this purpose. For the model to analyze the content of PDF documents effectively, that content has to be supplied as text in the prompt. To extract it, we will use OCR technology and pyTesseract.</p>
<p><img decoding="async" data-attachment-id="2338" data-permalink="https://inero-software.com/document-data-processing-using-gpt-model/inero-glify-11-3/" data-orig-file="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1.png" data-orig-size="417,208" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="" data-image-description="&lt;p&gt;Inero Software&lt;/p&gt;
" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1-300x150.png" data-large-file="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1.png" tabindex="0" role="button" class="aligncenter wp-image-2338" src="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1-300x150.png" alt="Separating icon" width="92" height="46" srcset="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1-300x150.png 300w, https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1.png 417w" sizes="(max-width: 92px) 100vw, 92px" /></p>
<h4 style="text-align: center;"><span style="color: #800080;"><strong>PREPARING THE DATA AND THE CODE</strong></span></h4>
<h5 style="text-align: center;"><span style="color: #800080;"><strong>USING OCR</strong></span></h5>
<p>OCR stands for Optical Character Recognition, a technology that converts images of text, such as scans or image-based PDFs, into machine-readable text. pyTesseract is a Python wrapper for Google&#8217;s Tesseract-OCR Engine that makes it easy to add OCR capabilities to Python scripts. It can extract text from images and scanned documents; PDF pages generally need to be rendered as images first. pyTesseract is an open-source library that can be installed via pip, while the Tesseract engine itself has to be installed separately.</p>
<p><img fetchpriority="high" decoding="async" data-attachment-id="4994" data-permalink="https://inero-software.com/document-data-processing-using-gpt-model/gpt-pyt-01/" data-orig-file="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-01.png" data-orig-size="697,313" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="GPT-PYT-01" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-01-300x135.png" data-large-file="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-01.png" tabindex="0" role="button" class="size-full wp-image-4994 aligncenter" src="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-01.png" alt="" width="697" height="313" srcset="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-01.png 697w, https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-01-300x135.png 300w, https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-01-668x300.png 668w" sizes="(max-width: 697px) 100vw, 697px" /></p>
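<p>A minimal sketch of such an OCR step might look as follows. It assumes that the pdf2image package is used to render PDF pages as images and that the Polish language data (<code>pol</code>) is installed for Tesseract; the function names <code>pdf_to_text</code> and <code>normalize_ocr_text</code> and the 300 dpi setting are illustrative choices, not taken from the experiment.</p>

```python
import re
from pathlib import Path


def normalize_ocr_text(raw: str) -> str:
    """Tidy raw OCR output so it can be embedded in a prompt."""
    text = raw.replace("\x0c", "\n")        # Tesseract typically ends a page with a form feed
    text = re.sub(r"[ \t]+", " ", text)      # collapse runs of spaces and tabs
    text = re.sub(r" ?\n ?", "\n", text)     # trim spaces around line breaks
    text = re.sub(r"\n{3,}", "\n\n", text)   # keep at most one blank line in a row
    return text.strip()


def pdf_to_text(pdf_path: Path, lang: str = "pol", dpi: int = 300) -> str:
    """OCR every page of a PDF and return the cleaned text (hypothetical helper)."""
    # Third-party imports are kept local so that normalize_ocr_text can be
    # used on machines where the OCR stack is not installed.
    import pytesseract
    from pdf2image import convert_from_path

    pages = convert_from_path(str(pdf_path), dpi=dpi)  # one PIL image per page
    raw = "\n".join(pytesseract.image_to_string(page, lang=lang) for page in pages)
    return normalize_ocr_text(raw)
```

<p>Cleaning the text before it goes into the prompt keeps the token count down, which matters because the prompt and the generated answer share the model&#8217;s token budget.</p>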
<h5 style="text-align: center;"><span style="color: #800080;"><strong>INTEGRATION WITH AN API</strong></span></h5>
<p>The next step is to use the endpoint that handles text completions. We can interact with the API through HTTP requests from many languages, for example via the official Python bindings, the official Node.js library, or a community-maintained library. Here we use the Python bindings, which can be added to the project with the following command:</p>
<blockquote><p>pip install openai</p></blockquote>
<p>Every request must include our API key in the Authorization HTTP header. The key can be retrieved from the API Keys page.</p>
<blockquote><p>openai.api_key = os.getenv(&quot;OPENAI_API_KEY&quot;)</p></blockquote>
<p>To summarize input text, we can use the endpoint that creates a completion for the provided prompt and parameters.</p>
<blockquote><p>POST https://api.openai.com/v1/completions</p></blockquote>
<ul>
<li><strong>model</strong> is the specific pre-trained language model that the OpenAI API will use to generate text.</li>
<li><strong>prompt</strong> is the input text that the API will use as a starting point for generating new text. This can be a sentence or a paragraph, and is used to provide context for the text generation.</li>
<li><strong>max_tokens</strong> is an integer value that specifies the maximum number of tokens (words or word pieces) that the API will generate in its response.</li>
<li><strong>temperature</strong> is a value that controls the &#8220;creativity&#8221; of the generated text. Lower values produce text that is more focused and deterministic, while higher values produce text that is more varied and creative.</li>
</ul>
<p><img loading="lazy" decoding="async" data-attachment-id="4995" data-permalink="https://inero-software.com/document-data-processing-using-gpt-model/gpt-pyt-02/" data-orig-file="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-02.png" data-orig-size="628,223" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="GPT-PYT-02" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-02-300x107.png" data-large-file="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-02.png" tabindex="0" role="button" class="size-full wp-image-4995 aligncenter" src="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-02.png" alt="" width="628" height="223" srcset="https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-02.png 628w, https://inero-software.com/wp-content/uploads/2023/01/GPT-PYT-02-300x107.png 300w" sizes="(max-width: 628px) 100vw, 628px" /></p>
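<p>The request described above can also be sketched with the Python standard library alone, which makes the Authorization header and the four parameters explicit. This is an illustration under stated assumptions rather than the code used in the experiment: the helper names <code>build_completion_request</code> and <code>complete</code> and the default values for <code>max_tokens</code> and <code>temperature</code> are hypothetical, and model availability changes over time, so the default model name may need to be replaced.</p>

```python
import json
import os
import urllib.request

COMPLETIONS_URL = "https://api.openai.com/v1/completions"


def build_completion_request(prompt, api_key, model="text-davinci-003",
                             max_tokens=256, temperature=0.0):
    """Build (but do not send) a POST request for the completions endpoint."""
    payload = {
        "model": model,              # which pre-trained model generates the text
        "prompt": prompt,            # OCR text plus the questions
        "max_tokens": max_tokens,    # upper bound on the length of the answer
        "temperature": temperature,  # 0.0 keeps answers as repeatable as possible
    }
    return urllib.request.Request(
        COMPLETIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def complete(prompt, **params):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, os.environ["OPENAI_API_KEY"], **params)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"].strip()
```

<p>A low temperature is a reasonable starting point for data extraction, where the same document should yield the same answer on every run.</p>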
<h5 style="text-align: center;"><span style="color: #800080;"><strong>DATA PREPARATION</strong></span></h5>
<p>For the test we use several carrier&#8217;s liability insurance certificates containing fake data.</p>
<p>Carrier&#8217;s liability insurance is a type of insurance coverage that protects a carrier, such as a shipping company, against claims made by the customer for loss or damage to the goods being transported. It provides financial protection to the carrier in the event that goods are damaged or lost while in transit.</p>
<p>The documents have the following structure:<br />
<img loading="lazy" decoding="async" data-attachment-id="4996" data-permalink="https://inero-software.com/document-data-processing-using-gpt-model/ocp-pyt/" data-orig-file="https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT.jpg" data-orig-size="1700,2200" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="OCP-PYT" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT-232x300.jpg" data-large-file="https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT-796x1030.jpg" tabindex="0" role="button" class="wp-image-4996 aligncenter" src="https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT.jpg" alt="" width="1000" height="1294" srcset="https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT.jpg 1700w, https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT-232x300.jpg 232w, https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT-796x1030.jpg 796w, https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT-768x994.jpg 768w, https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT-1187x1536.jpg 1187w, https://inero-software.com/wp-content/uploads/2023/01/OCP-PYT-1583x2048.jpg 1583w" sizes="(max-width: 1000px) 100vw, 1000px" /></p>
<p>For the purposes of this experiment, we have prepared 6 variants of such a document, which are available for download <a href="https://inero-software.com/wp-content/uploads/2023/01/TestDataOCP.zip"><strong>HERE</strong></a>. They differ, for example, in the insured, the insurance period, the limit of liability, the insurer, and the territorial scope.</p>
<p>For each variant, we asked seven questions to check how well the model understands the content.</p>
<ul>
<li>Who is the insured? Give me name and NIP number.</li>
<li>Who is the insurer?</li>
<li>What is the limit in general liability?</li>
<li>What is the certificate number?</li>
<li>What is the scope of international transport?</li>
<li>Where was this document issued?</li>
<li>What is the insurance period?</li>
</ul>
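<p>One way to combine the OCR text with such questions is a simple prompt template. The exact prompt wording used in the experiment is not given here, so the template and the function name <code>build_prompt</code> below are assumptions made for illustration.</p>

```python
def build_prompt(document_text, questions):
    """Combine the OCR text and a list of questions into a single prompt."""
    numbered = "\n".join(
        f"{number}. {question}"
        for number, question in enumerate(questions, start=1)
    )
    return (
        "Answer the questions using only the document below.\n\n"
        f"Document:\n{document_text.strip()}\n\n"
        f"Questions:\n{numbered}\n\n"
        "Answers:"
    )


# Example with two of the questions and a placeholder document text.
prompt = build_prompt(
    "Certificate no. 123456789 ...",
    ["Who is the insurer?", "What is the certificate number?"],
)
```

<p>Ending the prompt with &#8220;Answers:&#8221; gives a completion model a clear place to continue from.</p>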
<p>Below are the answers:</p>
<p>OCP-01<br />
The insured is XYZ S.A. with NIP: 000-00-00-001. The insurer is Towarzystwo Ubezpieczen ABC S.A. The limit in general liability is 300.000,00 EUR per occurrence. The certificate number is 123456789. The scope of international transport is Europe including the countries of the former USSR, such as Russia, Belarus, Ukraine and Moldova, and the Asian part of Turkey. This document was issued in Warsaw, Poland and the insurance period is from 01.01.2023 to 01.01.2024.</p>
<p>OCP-02<br />
The insured is ZYX S.A. with NIP: 000-00-00-002. The insurer is Towarzystwo Ubezpieczen BCA S.A. The limit in general liability is 150.000,00 EUR per occurrence. The certificate number is 234567891. The scope of international transport is Europe including the countries of the former USSR, such as Russia, Belarus, Ukraine and Moldova, and the Asian part of Turkey. This document was issued in Warsaw, Poland and the insurance period is from 01.01.2024 to 01.01.2025.</p>
<p>OCP-03<br />
The insured is YXZ S.A. with NIP: 000-00-00-003. The insurer is Towarzystwo Ubezpieczen CBA S.A. The limit in general liability is 420.000,00 EUR per occurrence. The certificate number is 345678912. The scope of international transport is Europe including the countries of the former USSR, such as Russia, Belarus, Ukraine and Moldova, and the Asian part of Turkey. This document was issued in Warsaw, Poland and the insurance period is from 01.06.2023 to 01.06.2024.</p>
<p>OCP-04<br />
The insured is CCC S.A. with NIP: 000-00-00-004. The insurer is Towarzystwo Ubezpieczen VCX S.A. The limit in general liability is 500.000,00 EUR per occurrence. The certificate number is 456789123. The scope of international transport is Europe including the countries of the former USSR, such as Russia, Belarus, Ukraine and Moldova, and the Asian part of Turkey. This document was issued in Warsaw, Poland and the insurance period is from 01.02.2023 to 01.02.2024.</p>
<p>OCP-05<br />
The insured is CCTA S.A. with NIP: 000-00-00-005. The insurer is Towarzystwo Ubezpieczen VVV S.A. The limit in general liability is 440.000,00 EUR per occurrence. The certificate number is 678912345. The scope of international transport is countries of the European Union (excluding Spain and Portugal), including Germany. This document was issued in Warsaw, Poland and the insurance period is from 02.04.2023 to 02.04.2024.</p>
<p>OCP-06<br />
The insured is AVX S.A. with NIP: 000-00-00-005. The insurer is Towarzystwo Ubezpieczen HFX S.A. The limit in general liability is 750.000,00 EUR per occurrence. The certificate number is 567891234. The scope of international transport is countries of the European Union (excluding Romania, Bulgaria, Poland), including Germany. This document was issued in Warsaw, Poland and the insurance period is from 02.07.2023 to 02.07.2024.</p>
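<p>Because the answers above follow the same sentence pattern, they can be turned into structured records for further processing. The sketch below is one possible approach, not part of the experiment: the function names are hypothetical, and the regular expressions assume exactly the phrasing shown in these answers, so a differently worded answer raises an error instead of being parsed.</p>

```python
import re
from datetime import date, datetime
from decimal import Decimal


def parse_amount(text: str) -> Decimal:
    """Convert a European-style amount such as '300.000,00' to a Decimal."""
    return Decimal(text.replace(".", "").replace(",", "."))


def parse_date(text: str) -> date:
    """Convert a 'DD.MM.YYYY' string to a date."""
    return datetime.strptime(text, "%d.%m.%Y").date()


def parse_answer(answer: str) -> dict:
    """Extract the unambiguous fields from one model answer."""
    nip = re.search(r"NIP: ([\d-]+)", answer)
    certificate = re.search(r"certificate number is (\d+)", answer)
    limit = re.search(r"limit in general liability is ([\d.,]+) ([A-Z]{3})", answer)
    period = re.search(r"from (\d{2}\.\d{2}\.\d{4}) to (\d{2}\.\d{2}\.\d{4})", answer)
    if not (nip and certificate and limit and period):
        raise ValueError("answer does not follow the expected sentence pattern")
    return {
        "nip": nip.group(1),
        "certificate": certificate.group(1),
        "limit": parse_amount(limit.group(1)),
        "currency": limit.group(2),
        "period_start": parse_date(period.group(1)),
        "period_end": parse_date(period.group(2)),
    }
```

<p>Typed values such as <code>Decimal</code> and <code>date</code> make it straightforward to compare the extracted fields against the source certificates.</p>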
<p><img decoding="async" data-attachment-id="2338" data-permalink="https://inero-software.com/document-data-processing-using-gpt-model/inero-glify-11-3/" data-orig-file="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1.png" data-orig-size="417,208" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="" data-image-description="&lt;p&gt;Inero Software&lt;/p&gt;
" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1-300x150.png" data-large-file="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1.png" tabindex="0" role="button" class="aligncenter wp-image-2338" src="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1-300x150.png" alt="Separating icon" width="92" height="46" srcset="https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1-300x150.png 300w, https://inero-software.com/wp-content/uploads/2018/11/inero-glify-11-1.png 417w" sizes="(max-width: 92px) 100vw, 92px" /></p>
<h2 style="text-align: center;"><span style="color: #800080;"><strong>SUMMARY</strong></span></h2>
<p><img loading="lazy" decoding="async" data-attachment-id="2768" data-permalink="https://inero-software.com/data-the-playground-of-machine-learning/inero-glify-08/" data-orig-file="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png" data-orig-size="208,208" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="Inero Glyph v8" data-image-description="" data-image-caption="" data-medium-file="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png" data-large-file="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png" tabindex="0" role="button" class="wp-image-2768 alignleft" src="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png" alt="Paragraph icon" width="66" height="66" srcset="https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08.png 208w, https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08-80x80.png 80w, https://inero-software.com/wp-content/uploads/2019/05/inero-glify-08-50x50.png 50w" sizes="(max-width: 66px) 100vw, 66px" />The model correctly interpreted the data contained in the documents, despite potential issues with the OCR tool reading multi-column text. It handled the Polish text well and correctly processed the information it contained, even though the questions were written in English. Requests were processed separately for each document, yet the style of the answers remained consistent. We expect that in the near future such solutions will prove helpful across a broad range of information-processing tasks.</p>
<p>The article <a href="https://inero-software.com/document-data-processing-using-gpt-model/">Document Data Processing using GPT model</a> comes from <a href="https://inero-software.com">Inero Software - Software Consulting</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">4993</post-id>	</item>
	</channel>
</rss>
