Using the Extract from File App in OttoKit

Need to pull structured data out of files in your workflows? Whether it’s HTML tables, CSV rows, Excel spreadsheets, PDFs, or calendar events, the Extract from File App makes it easy to parse and access the data you need without external tools or custom scripts.

You can extract from file URLs or raw data strings, choose how you want the data parsed, and map the output directly into your workflow steps. The result is clean, structured data, ready to use wherever you need it.

Note: This integration extracts and parses file content. It doesn’t modify, execute, or generate files.

Actions in the Extract from File App

1) Extract From HTML

Pull structured data from HTML files or markup, perfect for scraping tables, forms, or other content.

To extract from HTML:

  1. Add a new action to your workflow.
  2. Search for Extract from File and select it.
  3. Choose Extract From HTML.
  4. Click Continue to open the Configure tab.
  5. Fill in the fields:
    • Data (required) – Specify or map the HTML file or data.
  6. Click Show Optional Fields for additional configuration:
    • Header Rows – Parse table headers separately from data (True or False).
    • Include Empty Cells – Include empty table cells in output (True or False).
    • Raw Data – Return a raw HTML string instead of parsed content (True or False).
    • Read As String – Read the data using string encoding instead of iconv decoding (True or False).
  7. Click Continue to open the Test Step tab.
  8. Click Test to extract your HTML data.
  9. If everything appears to be in order, click Save.
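
To make the Header Rows and Include Empty Cells options more concrete, here is a minimal Python sketch that uses only the standard library. It is not OttoKit's implementation: the `extract_table` function and its parameter names are hypothetical and only mirror the option names listed above.

```python
from html.parser import HTMLParser


class _TableParser(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def extract_table(html, header_rows=True, include_empty_cells=True):
    """Hypothetical helper: turn HTML table markup into rows or records."""
    parser = _TableParser()
    parser.feed(html)
    parser.close()
    rows = parser.rows
    if header_rows and rows:
        # Treat the first row as column names and key each data row by them.
        headers, body = rows[0], rows[1:]
        records = [dict(zip(headers, row)) for row in body]
        if not include_empty_cells:
            records = [{k: v for k, v in r.items() if v != ""} for r in records]
        return records
    if not include_empty_cells:
        rows = [[cell for cell in row if cell != ""] for row in rows]
    return rows


sample_html = (
    "<table><tr><th>Name</th><th>Email</th></tr>"
    "<tr><td>Ada</td><td></td></tr></table>"
)
records = extract_table(sample_html)
```

With headers enabled, each data row becomes a record keyed by column name. Switching off empty cells simply drops the blank values.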

2) Extract From CSV

Parse CSV data into clean, structured rows you can map and use throughout your workflow.

To extract from CSV:

  1. Add a new action.
  2. Search for Extract from File.
  3. Select Extract From CSV.
  4. Click Continue.
  5. Fill in the fields:
    • Data (required) – Specify or map the CSV file or data.
  6. Click Show Optional Fields for additional configuration:
    • Delimiter – Specify the field separator, e.g., , or ;.
    • Encoding – Select encoding format (UTF-8 or ASCII).
    • Exclude BOM – Exclude BOM if present (True or False).
    • Header Rows – Treat first row as column headers (True or False).
    • Preserve Quotes – Handle unclosed quotes gracefully (True or False).
    • Include Empty Cells – Include empty cells as empty strings (True or False).
    • Max Number of Rows – Specify the max number of rows to load.
    • Raw Data – Return raw CSV text instead of parsed data (True or False).
    • Read As String – Read the data using string encoding instead of iconv decoding (True or False).
  7. Click Continue to open the Test Step tab.
  8. Click Test, then Save.
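
As a rough illustration of how the Delimiter, Exclude BOM, Header Rows, and Max Number of Rows options interact, the sketch below does the equivalent parsing with Python's built-in `csv` module. The `parse_csv` function and its parameters are assumptions made for this example and do not describe how OttoKit works internally.

```python
import csv
import io


def parse_csv(raw, delimiter=",", encoding="utf-8", exclude_bom=True,
              header_rows=True, max_rows=None):
    """Hypothetical helper: parse CSV bytes into rows or records."""
    text = raw.decode(encoding)
    # A UTF-8 byte order mark decodes to U+FEFF and would otherwise
    # end up glued to the first header name.
    if exclude_bom and text.startswith("\ufeff"):
        text = text[1:]
    rows = list(csv.reader(io.StringIO(text, newline=""), delimiter=delimiter))
    if header_rows and rows:
        headers, body = rows[0], rows[1:]
        if max_rows is not None:
            body = body[:max_rows]
        # One dict per data row, keyed by column name.
        return [dict(zip(headers, row)) for row in body]
    return rows if max_rows is None else rows[:max_rows]


# A semicolon-separated export that starts with a UTF-8 BOM.
sample = (b"\xef\xbb\xbfemail;first_name\n"
          b"ada@example.com;Ada\n"
          b"grace@example.com;Grace\n")
contacts = parse_csv(sample, delimiter=";")
```

If the BOM is left in place, the first column name silently becomes `"\ufeffemail"`, which is a common reason a mapped column appears to be missing.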

3) Extract From XLSX

Extract data from Excel spreadsheets with support for specific sheets, ranges, and header parsing.

To extract from XLSX:

  1. Add a new action.
  2. Search for Extract from File and choose Extract From XLSX.
  3. Click Continue.
  4. Fill in the fields:
    • Data (required) – Specify or map the XLSX file or data.
  5. Click Show Optional Fields:
    • Sheet Name – Specify the name of the sheet to extract.
    • Header Rows – First row contains column headers (True or False).
    • Include Empty Cells – Include empty cells as empty strings (True or False).
    • Range – Specify the cell range (A1 notation like “A1:D10” or starting row number).
    • Raw Data – Return raw data without JSON conversion (True or False).
    • Read As String – Use string encoding for special characters (True or False).
  6. Click Continue to open the Test Step tab.
  7. Click Test and Save.
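
If A1 notation is unfamiliar, the following Python sketch shows how a range such as "A1:D10" translates into row and column positions. It covers only the A1 form, not the starting-row-number form, and the helper names (`column_index`, `parse_cell`, `slice_range`) are made up for this illustration.

```python
import re

# Column letters followed by a row number that does not start with zero.
_CELL = re.compile(r"^([A-Za-z]+)([1-9][0-9]*)$")


def column_index(letters):
    """Convert a column label to a zero-based index (A -> 0, Z -> 25, AA -> 26)."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index - 1


def parse_cell(ref):
    """Return (row, col), both zero-based, for a reference such as 'D10'."""
    match = _CELL.match(ref.strip())
    if match is None:
        raise ValueError(f"not an A1-style cell reference: {ref!r}")
    letters, digits = match.groups()
    return int(digits) - 1, column_index(letters)


def slice_range(rows, a1_range):
    """Return the cells of `rows` (a list of lists) covered by e.g. 'A1:D10'."""
    start, end = a1_range.split(":")
    (r0, c0), (r1, c1) = parse_cell(start), parse_cell(end)
    # Both corners are inclusive, hence the + 1 on the upper bounds.
    return [row[c0:c1 + 1] for row in rows[r0:r1 + 1]]


grid = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"]]
block = slice_range(grid, "B2:C3")
```

Here "B2:C3" selects the second and third rows and the second and third columns, so `block` holds the four cells in the lower-right corner of the grid.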

4) Extract From ODS

Pull data from OpenDocument Spreadsheets with the same flexibility as Excel files.

To extract from ODS:

  1. Add a new action and select Extract From ODS.
  2. Configure the fields:
    • Data (required) – Specify or map the ODS file or data.
  3. Click Show Optional Fields:
    • Sheet Name – Specify the name of the sheet to extract.
    • Header Rows – First row contains column headers (True or False).
    • Include Empty Cells – Include empty cells as empty strings (True or False).
    • Range – Specify the cell range to extract.
    • Raw Data – Return raw data without parsing (True or False).
    • Read As String – Use string encoding for special characters (True or False).
  4. Click Continue to open the Test Step tab.
  5. Click Test and Save.

5) Extract From XML

Parse XML files and access structured data with encoding options.

To extract from XML:

  1. Add the action and select Extract From XML.
  2. Configure:
    • Data (required) – Specify or map the XML file or data.
    • Encoding – UTF-8 or ASCII.
    • Strip BOM – Remove Byte Order Mark if present (True or False).
  3. Click Continue to open the Test Step tab.
  4. Click Test and Save.

6) Extract From TXT

Extract plain text data with options for line splitting and encoding.

To extract from TXT:

  1. Add the action and choose Extract From TXT.
  2. Configure:
    • Data (required) – Specify or map the TXT file or data.
  3. Click Show Optional Fields:
    • Encoding – Select the file encoding type (UTF-8 or ASCII).
    • Split Lines – Return an array of lines instead of a single text (True or False).
    • Remove Empty Lines – Remove empty lines when splitting (True or False).
  4. Click Continue to open the Test Step tab.
  5. Click Test and Save.
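
The effect of Split Lines and Remove Empty Lines can be sketched in a few lines of Python. This is only an illustration of the behaviour described above. The `extract_text` function is hypothetical, and treating whitespace-only lines as empty is an assumption.

```python
def extract_text(raw, encoding="utf-8", split_lines=False,
                 remove_empty_lines=False):
    """Hypothetical helper: decode text bytes and optionally split into lines."""
    text = raw.decode(encoding)
    if not split_lines:
        return text  # a single string, line breaks untouched
    lines = text.splitlines()
    if remove_empty_lines:
        # Assumption: a line containing only whitespace counts as empty.
        lines = [line for line in lines if line.strip()]
    return lines


sample = b"first\n\nsecond\n"
as_lines = extract_text(sample, split_lines=True, remove_empty_lines=True)
```

Returning an array of lines is usually the more convenient shape when a later step needs to loop over the entries one at a time.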

7) Extract From RTF

Pull text content from Rich Text Format files.

To extract from RTF:

  1. Add the action and select Extract From RTF.
  2. Configure:
    • Data (required) – Specify or map the RTF file or data.
    • Raw Data – Return raw RTF content instead of parsed text (True or False).
    • Read As String – Use string encoding for special characters (True or False).
  3. Click Continue to open the Test Step tab.
  4. Click Test and Save.

8) Extract From Base64

Decode Base64-encoded data into usable content.

To extract from Base64:

  1. Add the action and choose Extract From Base64.
  2. Configure:
    • Data (required) – Specify or map the Base64 file or data.
    • Encoding – Select the file encoding type (UTF-8 or ASCII).
  3. Click Continue to open the Test Step tab.
  4. Click Test and Save.
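
Base64 decoding happens in two stages: the Base64 text becomes raw bytes, and those bytes are then interpreted using the selected encoding. The Python sketch below shows both stages with the standard `base64` module. The `decode_base64` wrapper is a hypothetical name used for illustration only.

```python
import base64


def decode_base64(data, encoding="utf-8"):
    """Hypothetical helper: decode a Base64 string to text."""
    # validate=True rejects characters outside the Base64 alphabet
    # instead of silently discarding them.
    raw = base64.b64decode(data, validate=True)
    # The Encoding option decides how the raw bytes become characters.
    return raw.decode(encoding)


greeting = decode_base64("SGVsbG8=")
```

If the decoded bytes are not text at all (an image, for example), the second stage fails, so the encoding choice only matters for text payloads.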

9) Extract From JSON

Parse JSON data and access nested values in your workflows.

To extract from JSON:

  1. Add the action and select Extract From JSON.
  2. Configure:
    • Data (required) – Specify or map the JSON file or data.
    • Encoding – Select the file encoding type (UTF-8 or ASCII).
    • Strip BOM – Remove Byte Order Mark if present (True or False).
  3. Click Continue to open the Test Step tab.
  4. Click Test and Save.
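
To show why Strip BOM matters for JSON, here is a small Python sketch. Python's own `json` module refuses text that begins with a byte order mark, so the mark has to be removed first. The `parse_json` function is a hypothetical stand-in, not OttoKit's code.

```python
import json


def parse_json(raw, encoding="utf-8", strip_bom=True):
    """Hypothetical helper: decode JSON bytes and parse them."""
    text = raw.decode(encoding)
    # A UTF-8 byte order mark decodes to U+FEFF at the start of the text.
    if strip_bom and text.startswith("\ufeff"):
        text = text[1:]
    return json.loads(text)


sample = b'\xef\xbb\xbf{"order": {"id": 42, "items": [{"sku": "A-1"}]}}'
data = parse_json(sample)
# Nested values are reached key by key, then by list position.
first_sku = data["order"]["items"][0]["sku"]
```

Once parsed, nested values are addressed by walking down the structure, which is the same idea as mapping a nested field from the step output.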

10) Extract From PDF

Pull text content from PDF documents with options for page handling and password protection.

To extract from PDF:

  1. Add the action and choose Extract From PDF.
  2. Configure:
    • Data (required) – Specify or map the PDF file or data.
  3. Click Show Optional Fields:
    • Join Pages – Combine all pages into a single text (True) or return an array of pages (False).
    • Max Pages – Specify the maximum number of pages to process (-1 for all).
    • Password – Specify the password for encrypted PDF files.
  4. Click Continue to open the Test Step tab.
  5. Click Test and Save.
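
The sketch below illustrates how Join Pages and Max Pages shape the output. It assumes the page text has already been extracted (reading a real PDF needs a PDF library, which is out of scope here). The `shape_pdf_output` function is hypothetical, and joining pages with a newline is an assumption.

```python
def shape_pdf_output(pages, join_pages=True, max_pages=-1):
    """Hypothetical helper: `pages` is a list with one text string per page."""
    # A negative limit means "process every page".
    selected = pages if max_pages < 0 else pages[:max_pages]
    if join_pages:
        # Assumption: pages are separated by a newline when joined.
        return "\n".join(selected)
    return selected


pages = ["Invoice INV-1042", "Line items", "Terms and conditions"]
joined = shape_pdf_output(pages)
first_two = shape_pdf_output(pages, join_pages=False, max_pages=2)
```

A single joined string is easier to drop into one spreadsheet cell, while an array of pages suits workflows that handle each page separately.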

11) Extract From ICS

Parse calendar event data from ICS files.

To extract from ICS:

  1. Add the action and select Extract From ICS.
  2. Configure:
    • Data (required) – Specify or map the ICS file or data.
    • Encoding – Select the file encoding type (UTF-8 or ASCII).
    • Strip BOM – Remove Byte Order Mark if present (True or False).
  3. Click Continue to open the Test Step tab.
  4. Click Test and Save.
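
For a sense of what parsing an ICS file involves, here is a deliberately minimal Python sketch that reads the event properties out of calendar text. It handles folded lines and strips property parameters, but it does not unescape text values or cover the full iCalendar format. The `parse_ics_events` function is hypothetical, and the actual field names in OttoKit's output may differ.

```python
def parse_ics_events(text):
    """Hypothetical helper: return one dict of properties per VEVENT block."""
    # Unfold first: a line starting with a space or tab continues the
    # previous line (iCalendar line folding).
    unfolded = []
    for line in text.splitlines():
        if line[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += line[1:]
        else:
            unfolded.append(line)

    events, current = [], None
    for line in unfolded:
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT" and current is not None:
            events.append(current)
            current = None
        elif current is not None and ":" in line:
            name, value = line.split(":", 1)
            # Drop parameters such as ";TZID=Europe/Berlin" from the name.
            current[name.split(";", 1)[0].upper()] = value
    return events


sample_ics = (
    "BEGIN:VCALENDAR\r\n"
    "BEGIN:VEVENT\r\n"
    "SUMMARY:Project kickoff\r\n"
    "DTSTART;TZID=Europe/Berlin:20250301T100000\r\n"
    "DTEND;TZID=Europe/Berlin:20250301T110000\r\n"
    "LOCATION:Room 4\r\n"
    "DESCRIPTION:Agenda and\r\n"
    "  next steps\r\n"
    "END:VEVENT\r\n"
    "END:VCALENDAR\r\n"
)
events = parse_ics_events(sample_ics)
```

Each event ends up as a flat set of named values (title, start, end, location, description), which is the shape you would expect to map into later steps.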

12) Extract From XLS

Pull data from legacy Excel files with full control over sheets, ranges, and headers.

To extract from XLS:

  1. Add a new action and select Extract From XLS.
  2. Configure:
    • Data (required) – Specify or map the XLS file or data.
  3. Click Show Optional Fields:
    • Sheet Name – Specify the name of the sheet to extract.
    • Header Rows – First row contains column headers (True or False).
    • Include Empty Cells – Include empty cells as empty strings (True or False).
    • Range – Specify the cell range (A1 notation like “A1:D10” or starting row number).
    • Raw Data – Return raw data without JSON conversion (True or False).
    • Read As String – Use string encoding for special characters (True or False).
  4. Click Continue to open the Test Step tab.
  5. Click Test and Save.

Practical Use Cases

Below are three practical examples showing how to put the Extract from File App to work.

Extract Text from a PDF Invoice and Save It to a Spreadsheet

Scenario: Your suppliers send invoices as PDF email attachments. Instead of opening each PDF and copying the details manually, you can use the Extract From PDF action to automatically pull the text content and save it to a Google Sheet for your accounts team to review.

Step 1: Set Up the Trigger

  1. Create a new workflow in OttoKit.
  2. Click the trigger area and search for Gmail.
  3. Select New Email received as the trigger event.
  4. Connect your Gmail account.
  5. Click Save Trigger, then Fetch Data to load a sample email with a PDF attachment. This gives you the attachment URL to use in the next step.

Note:

  • Make sure the sample email you fetch actually has a PDF attached. The attachment URL is what the Extract from File step will use to read the file.
  • You can add a Filter so the workflow continues only when the email includes an attachment and comes from a specific sender.

Step 2: Add the Extract from File App

  1. Click the + button below the trigger.
  2. Search for Extract from File and select it.
  3. Choose Extract From PDF as the action.
  4. Click Continue to open the Configure tab.
  5. Data (required): map the attachment URL from the Gmail trigger using @. This tells OttoKit which PDF to read.
  6. Click Show Optional Fields and set the following:
    • Join Pages – Set to True to combine all pages into a single block of text. This makes it easier to map the output to a spreadsheet.
    • Max Pages – Enter -1 to process all pages, or enter a number such as 3 to limit to the first three pages.
    • Password – Only fill this in if the PDF is password-protected.
  7. Click Continue, then click Test. The output will show the extracted text content from the PDF.
  8. Click Save.

Step 3: Save the Extracted Data to Google Sheets

  1. Click the + button below the Extract from File step.
  2. Search for “Google Sheets” and select “Add Row”.
  3. Connect your Google account and select the spreadsheet and sheet where invoices should be saved.
  4. Map the extracted text from the Extract from File step to the relevant columns in your spreadsheet. For example, map the full text to a column called Invoice Content, or use the Text Formatter app beforehand to split specific values out of the text.
  5. Click Test, then Save.
  6. Click Publish Workflow.

That is it. Now every time a PDF invoice arrives in your Gmail inbox, the workflow extracts the text and logs it to your spreadsheet automatically.
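
If you want individual values rather than the full text, the splitting step amounts to pattern matching on the extracted text. The Python sketch below shows the idea with regular expressions. The invoice layout ("Invoice No:", "Total:") and the `pick_invoice_fields` function are purely hypothetical. Real invoices vary, so the patterns would need adjusting per supplier.

```python
import re

# \b keeps "Total" from matching inside "Subtotal".
_NUMBER = re.compile(r"\bInvoice\s*(?:No\.?|Number|#)\s*:?\s*(\S+)", re.IGNORECASE)
_TOTAL = re.compile(r"\bTotal\b\s*:?\s*\$?\s*([0-9][0-9,]*\.?[0-9]*)", re.IGNORECASE)


def pick_invoice_fields(text):
    """Hypothetical helper: pull an invoice number and total out of plain text."""
    number = _NUMBER.search(text)
    total = _TOTAL.search(text)
    return {
        "invoice_number": number.group(1) if number else None,
        "total": total.group(1) if total else None,
    }


extracted_text = (
    "ACME Supplies\n"
    "Invoice No: INV-1042\n"
    "Subtotal: 100.00\n"
    "Total: $1,250.50\n"
)
fields = pick_invoice_fields(extracted_text)
```

Fields that cannot be found come back as `None`, which makes it straightforward to route incomplete invoices to manual review.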

Parse a CSV Contact Export and Add Each Contact to a Mailing List

Scenario: Your CRM lets you export contacts as a CSV file. Each week, your team uploads the latest export to a Google Drive folder, and you want each new contact to be automatically added to your Mailchimp mailing list. The Extract From CSV action reads the file and makes each row available as data for the next step.

Step 1: Set Up the Trigger

  1. Create a new workflow in OttoKit.
  2. Click the trigger area and search for Google Drive.
  3. Select New File in Folder as the trigger event.
  4. Connect your Google account and select the folder where your team uploads CSV exports.
  5. Click Save Trigger, then Fetch Data to load a sample file upload. This gives you the file URL to use in the next step.

Step 2: Add the Extract from File App

  1. Click the + button below the trigger.
  2. Search for Extract from File and select it.
  3. Choose Extract From CSV.
  4. Click Continue.
  5. Data (required): map the file URL from the Google Drive trigger using @.
  6. Click Show Optional Fields and set the following:
    • Header Rows – Set to True so OttoKit treats the first row as column headers and maps each column by name.
    • Delimiter – Enter a comma (,) if your CSV uses comma separation, or a semicolon (;) if it uses semicolons.
    • Encoding – Select UTF-8 for most standard CSV files.
  7. Click Continue, then click Test. The output will show the parsed rows with each column mapped by name.
  8. Click Save.

Note: The Extract from File app returns the rows as structured data. If your CSV has multiple rows, you can use a Loop step after this to process each contact one at a time.

Step 3: Add Each Contact to Your Mailing List

  1. Click the + button below the Extract from File step.
  2. Search for Mailchimp (or your preferred email marketing app) and select Add Subscriber.
  3. Connect your Mailchimp account and select the list or audience where contacts should be added.
  4. Email Address: Map the email column from the Extract from File step using @.
  5. First Name: Map the first name column.
  6. Last Name: Map the last name column.
  7. Click Test, then Save.
  8. Click Publish Workflow.

That is it. Now every time a CSV file is uploaded to the Google Drive folder, the workflow reads it and adds each contact to your mailing list automatically.
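
Conceptually, the Loop step takes each parsed row and turns it into one subscriber record. The Python sketch below illustrates that per-row mapping. The column names (`email`, `first_name`, `last_name`), the output keys, and the `rows_to_subscribers` function are assumptions for illustration and are not tied to the Mailchimp integration.

```python
def rows_to_subscribers(rows):
    """Hypothetical helper: map parsed CSV rows to subscriber records."""
    subscribers = []
    for row in rows:
        # Normalise the address; rows without one are skipped.
        email = (row.get("email") or "").strip().lower()
        if not email:
            continue
        subscribers.append({
            "email_address": email,
            "first_name": (row.get("first_name") or "").strip(),
            "last_name": (row.get("last_name") or "").strip(),
        })
    return subscribers


parsed_rows = [
    {"email": " Ada@Example.com ", "first_name": "Ada", "last_name": "Lovelace"},
    {"email": "", "first_name": "No", "last_name": "Email"},
]
subscribers = rows_to_subscribers(parsed_rows)
```

Skipping rows without an email address mirrors what a Filter step would do inside the loop, so incomplete contacts never reach the mailing list.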

Extract Meeting Details from an ICS File and Log Them to Your CRM

Scenario: Clients sometimes send meeting invitations as .ics calendar files via email. Instead of opening each file and manually entering the event details into your CRM, you can use the Extract From ICS action to automatically read the event title, date, time, and description, then create a new activity record in your CRM.

Step 1: Set Up the Trigger

  1. Create a new workflow in OttoKit.
  2. Click the trigger area and search for Gmail.
  3. Select New Email received as the trigger event.
  4. Connect your Gmail account.
  5. Click Save Trigger, then Fetch Data.

Note:

  • Make sure the sample email has an .ics file attached so you get the attachment URL in your sample data.
  • You can add a Filter so the workflow continues only when the email includes an attachment.

Step 2: Add the Extract from File App

  1. Click the + button below the trigger.
  2. Search for Extract from File and select it.
  3. Choose Extract From ICS.
  4. Click Continue.
  5. Data (required): map the attachment URL from the Gmail trigger using @.
  6. Click Show Optional Fields and set the following:
    • Encoding – Select UTF-8 for standard calendar files.
    • Strip BOM – Set to True to remove any Byte Order Mark characters that might appear at the start of the file.
  7. Click Continue, then click Test. The output will show the extracted event fields, such as the event title, start time, end time, location, and description.
  8. Click Save.

Step 3: Create an Activity Record in Your CRM

  1. Click the + button below the Extract from File step.
  2. Search for your CRM app, for example, HubSpot, and select Create Activity or Log Meeting.
  3. Connect your CRM account.
  4. Map the extracted fields from the ICS step to the correct fields in your CRM:
    • Activity Title – Map the event title from the Extract from File output.
    • Start Date/Time – Map the event start time.
    • End Date/Time – Map the event end time.
    • Notes – Map the event description.
    • Location – Map the event location if your CRM supports it.
  5. Click Test, then Save.
  6. Click Publish Workflow.

That is it. Now every time a client sends a calendar invite by email, the workflow reads the .ics file and logs the meeting details to your CRM automatically.

With the Extract from File App in OttoKit, you can pull clean, structured data from a wide range of file formats, including HTML, CSV, Excel, PDF, XML, JSON, and calendar files. No manual parsing, no external converters, and no headaches.

Whether you’re automating reports, processing uploaded files, syncing data between systems, or building workflows that need to read and interpret file content, OttoKit makes extraction fast, flexible, and dead simple.

Start using the Extract from File App today and turn any file into usable workflow data in seconds.
