Product releases 2016



A simple changelog of the Import.io product for 2016.

##19 December 2016


[improvement] Upgraded database :sunny:
Upgraded one of our databases to improve the performance and security of the app.


[improvement] Upgraded the Editor to use HTTPS :sunny:
Now, the whole application works under HTTPS.


[bug fix] Unable to add more columns to an extractor :bug:
Fixed a problem that prevented users from adding more columns to an extractor beyond a certain number.


[bug fix] URL Generator produces bad URLs when user selects parameters in the main domain :bug:
Fixed a problem where the URL Generator produced bad URLs when the user selected parameters in the main domain.


[bug fix] JSON and JavaScript URLs that alter the HTML freeze Website view of Extractor. :bug:
Fixed a problem where pages using JSON or JavaScript caused the Editor to freeze.


##18 December 2016


[bug fix] When pasting a URL on the new Extractor modal the Go button remains disabled :bug:
Fixed a problem where the Go button remained disabled when pasting a URL on the New Extractor modal.


##16 December 2016


[bug fix] Multiple crawl runs running in parallel :bug:
Fixed a problem that caused multiple crawl runs to run in parallel.


[improvement] New status message after stopping a crawl run :sunny:
Added an improved status message that is shown while the crawl run is stopping.


[improvement] Improved query counter message for overages :sunny:
Improved the query counter to show a cleaner message when users run into overages.


[bug fix] Dashboard refresh makes it hard to see the preview :bug:
Fixed a problem that prevented the preview from being displayed correctly.


[bug fix] Download button is not available after stopping a crawl run :bug:
Fixed a problem that prevented the download button from displaying after stopping a crawl run.


[bug fix] Request premium trial twice :bug:
Fixed a problem that allowed users to request a premium trial multiple times.


[improvement] Improved Live Query API backend :sunny:
Improvements on the Live Query API backend.


[bug fix] Incorrect count for “No data extracted” URLs :bug:
Fixed a problem that produced an incorrect count of URLs that were successfully rendered but from which no data was extracted.


[bug fix] Preview only shows one row :bug:
Fixed a problem where the preview only showed one row for some URLs.


[improvement] Several backend improvements :sunny:
General improvements on our backend to increase the success rate of crawl runs.


##15 December 2016


[bug fix] Problem with downgrading on the new subscription page :bug:
Fixed a problem that prevented users from downgrading from the Billing and Subscription page.


[bug fix] Typo on the reset password form :bug:
Fixed a typo on the reset password form.


##9 December 2016


[improvement] Crawl run time counter extended :sunny:
Added support for days on the crawl run time counter.
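
As an illustration only, a run time counter that rolls over into days could be formatted along these lines. The function name `format_run_time` and the `Nd HH:MM:SS` layout are assumptions made for this sketch, not the format the product actually displays.

```python
def format_run_time(total_seconds: int) -> str:
    """Format a duration as 'HH:MM:SS', prefixed with 'Nd ' once it exceeds a day."""
    days, remainder = divmod(total_seconds, 86_400)   # 86,400 seconds per day
    hours, remainder = divmod(remainder, 3_600)
    minutes, seconds = divmod(remainder, 60)
    clock = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    # Only show the day component when the run has lasted at least one day.
    return f"{days}d {clock}" if days else clock
```

With this sketch, `format_run_time(3661)` gives `01:01:01` and `format_run_time(90061)` gives `1d 01:01:01`.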


##6 December 2016


[improvement] Monthly query limit warning is now dismissible :sunny:
You can now dismiss the monthly query limit warning from the Dashboard.


[improvement] Improved error messages :sunny:
Improved error messages for unexpected errors.


[bug fix] Delete an extractor while running causes an inconsistent state :bug:
Fixed a bug that allowed an extractor to be deleted while it was running.


[new feature] Crawl runs now show successful and failed URLs count :new:
Crawl runs now show both the successful and failed URL counts.


[improvement] Clicking on a link now opens it in a new tab :sunny:
Clicking on a link now opens it in a new tab.


[bug fix] Null value is generated when empty URL parameters are used :bug:
Fixed a bug with the URL Generator when using empty values.


[bug fix] Wrong URLs are generated when “step” field is populated with double values :bug:
Fixed a bug with the URL Generator when using double values on a step.
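
To show why fractional steps are easy to get wrong, here is a minimal sketch of generating URLs over a numeric range with a non-integer step. It is not the URL Generator's actual code: the function `generate_urls`, the `{value}` placeholder and the example URL are all hypothetical. It uses `Decimal` because repeatedly adding binary floats (for example `0.1 + 0.2`) accumulates rounding error that would leak into the generated URLs.

```python
from decimal import Decimal


def generate_urls(template: str, start: str, stop: str, step: str) -> list[str]:
    """Fill the hypothetical `{value}` placeholder for each value from start to stop, inclusive."""
    current, end, increment = Decimal(start), Decimal(stop), Decimal(step)
    if increment <= 0:
        raise ValueError("step must be positive")
    urls = []
    while current <= end:
        # Decimal formats exactly as written ("1.0", not "1.0000000000000002").
        urls.append(template.format(value=current))
        current += increment
    return urls
```

For example, `generate_urls("https://example.com/items?price={value}", "0.5", "1.5", "0.5")` yields URLs ending in `price=0.5`, `price=1.0` and `price=1.5`.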


[bug fix] Run history should display 30 crawl runs :bug:
Fixed a bug that prevented the 30 crawl runs from displaying correctly.


[bug fix] URL count tooltip does not display correctly :bug:
Fixed a bug that prevented the URL count tooltip from displaying correctly.


##2 December 2016


[new feature] Timestamp on crawl runs :new:
Crawl runs now show the time at which the crawl run started.


[bug fix] Query counter does not refresh :bug:
Fixed a bug that prevented the query counter from refreshing when the crawl run completed.


##1 December 2016


[improvement] Free users can start a crawl run even if they don’t have enough queries :sunny:
Free users can choose to start a crawl run and use all their queries even if they don’t have enough to run through all the URLs.


[bug fix] Scheduled extractor did not run :bug:
Fixed a bug that prevented scheduled extractors from running when scheduled.


##30 November 2016


[improvement] Query reset counter :sunny:
Improved query counter now displays the overages on a separate counter.


[improvement] Data preview improvements :sunny:
Data preview is now available as soon as the run has produced 20 rows of data.


[new feature] Hourly schedules :new:
You can now schedule your extractor on an hourly basis.


[improvement] Extended extractors limit :sunny:
Up to 1,500 extractors are now visible in the product.


[improvement] General improvements on the URL Generator :sunny:
The URL Generator now has a smarter engine that makes better use of memory and CPU, which keeps the UI more responsive when generating large numbers of URLs.


[bug fix] Encoded values on the URL Generator :bug:
Fixed a bug where some characters were encoded by the URL Generator.


[bug fix] Dashboard crashes in Safari :bug:
Fixed a bug that prevented the app from working properly in Safari.


[bug fix] Cannot logout :bug:
Fixed a bug that prevented users from logging out of the app.

##29 November 2016


[improvement] New My Account page :sunny:
General improvements to the onboarding form.


[bug fix] Cannot reset password :bug:
Fixed a bug that prevented users from resetting their passwords.


##28 November 2016


[new feature] New My Account page :new:
A new My Account page now contains all your account details, including subscription and billing information.


[bug fix] Add URLs loses focus when adding a URL :bug:
Fixed a bug that made it hard to add new URLs in the Editor.


[bug fix] Wrong redirect when page render fails :bug:
Fixed a bug that redirected the user to the wrong place when the page could not be rendered.


[new feature] Enable/disable Get link URL for columns :new:
A new menu option on columns lets you control whether the link URL should be included.


##4 November 2016


[bug fix] API Key resets itself :bug:
Fixed a bug that caused the API Key to reset itself randomly.


[improvement] Improved crawl run history UI :sunny:
Moved to icons on the crawl run history for a cleaner Dashboard.


[bug fix] Subscription query renewal dates are not in sync with billing dates :bug:
Fixed a bug where the query renewal dates were out of sync with the billing renewal dates.


##3 November 2016


[bug fix] Data not displayed on the Website view when using XPath :bug:
Fixed a bug that prevented data from being displayed on the Website view when using XPath.


[bug fix] Wrong number of rows on my extractor runs :bug:
Fixed a bug with extractor runs showing an incorrect number of rows.


##31 October 2016


[bug fix] Cannot login using email with new domain extensions :bug:
Fixed a bug that prevented users from creating new accounts using an email address with a new domain extension.


[improvement] Improved user menu :sunny:
We made some minor improvements to the user menu.


##28 October 2016


[bug fix] Cannot login using email with new domain extensions :bug:
Fixed a bug that prevented users from creating new accounts using an email address with a new domain extension.


[improvement] Improved page rendering times :sunny:
We updated our backend systems to improve the time it takes to render a page, reducing both training and extraction times.

##26 October 2016


[new feature] Elapsed and estimated remaining time on Runs :new:
You can now see how much time has elapsed and an estimate of how much time is left for your Run to complete.
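
One common way to produce such an estimate is to extrapolate from the average time per URL so far. The sketch below assumes that approach; the changelog does not say how the product computes its estimate, and the function and parameter names are hypothetical.

```python
from typing import Optional


def estimate_remaining_seconds(
    elapsed_seconds: float, urls_done: int, urls_total: int
) -> Optional[float]:
    """Extrapolate remaining time from the average time spent per completed URL."""
    if urls_done <= 0:
        # Nothing has completed yet, so there is no rate to extrapolate from.
        return None
    seconds_per_url = elapsed_seconds / urls_done
    return seconds_per_url * (urls_total - urls_done)
```

For a run that has processed 20 of 100 URLs in 60 seconds, this gives 240 seconds remaining.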


[improvement] Duplicating an extractor no longer runs it by default :sunny:
When you duplicate an extractor it no longer starts running by default.


##19 October 2016


[bug fix] Cancel fails for runs with a very large number of URLs :bug:
Fixed a bug that prevented the cancel function from working properly for Runs with a very large number of URLs.


[bug fix] Download not available for successful scheduled runs with a large number of URLs :bug:
Fixed a bug that prevented the download function from working properly for scheduled Runs with a very large number of URLs.


##16 October 2016


[bug fix] Preview & Log files links are available when no files are available :bug:
Fixed a bug that enabled preview and log files links in some cases where no data was available for them.


[bug fix] “You are over your limit” banner wrongly displayed :bug:
Fixed a bug that made the “You are over your limit” banner display when it shouldn’t.


[improvement] Streamlined onboarding login flow :sunny:
Onboarding login flow was reduced to a single screen to streamline the process.


##15 October 2016


[bug fix] Using links created via XPath or Regex no longer breaks chained extractors :bug:
Using URL links created via XPath or Regex caused the chaining functionality to fail. The issue has been fixed.


##10 October 2016


[bug fix] Using manual XPath no longer breaks Data View :bug:
Using manual XPath caused the Data View to stop working properly in some cases. The issue has been fixed.


##5 October 2016


[improvement] Improved scheduled runs logic :sunny:
If a scheduled run got stuck for any reason, the following scheduled runs would not start. We improved the logic for run monitoring to make sure runs don’t get stuck any more.


##29 September 2016


[bug fix] URLs containing spaces now work :bug:
URLs containing spaces in their fragment (after the # sign) used to cause problems, in particular around the URL generator. The issue has been fixed.
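
For readers handling such URLs in their own scripts, the following sketch shows one possible way to isolate the fragment and percent-encode any spaces in it using Python's standard `urllib.parse`. It illustrates the problem space only and is not how the URL generator itself was fixed; the function name `encode_fragment_spaces` is hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_fragment_spaces(url: str) -> str:
    """Percent-encode unsafe characters (such as spaces) in the part after '#'."""
    parts = urlsplit(url)
    # Keep common reserved characters and existing %-escapes untouched.
    safe_fragment = quote(parts.fragment, safe="/?:@!$&'()*+,;=%")
    return urlunsplit(parts._replace(fragment=safe_fragment))
```

With this sketch, `https://example.com/page#section one` becomes `https://example.com/page#section%20one`, while the rest of the URL is left unchanged.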


##19 September 2016


[new feature] Notification when account suspended :new:
In the (rare) event that an account is suspended, the user will receive a notification at the next login.


##13 September 2016


[new feature] API access to JSON file :new:
The API now exposes the JSON for the latest run in addition to the CSV version. See the Integrate tab for your extractor.


[bug fix] URL generator deals properly with complex URLs :bug:
The URL generator now deals properly with URLs that contain a fragment, which is the part of the URL after a # sign.


##12 September 2016


[new quota] Modified the number of queries :bug:
We adjusted the number of queries for Trial and Free plans.


##7 September 2016


[bug fix] Fixed bug affecting simple chaining runs :bug:
We fixed a bug that prevented simple chaining runs from starting.


[improvement] Better retry :sunny:
When retrying a page, we no longer use the version we have cached.


##6 September 2016


[bug fix] Restarting page rendering :bug:
We fixed a bug that caused the page rendering service to fail and not restart.


##5 September 2016


[bug fix] Some scheduled runs got stuck :bug:
We fixed a bug that sometimes caused scheduled runs to get stuck.


[improvement] Better error reporting :sunny:
We split out the errors from the Webcache in the new crawler.


[bug fix] Fixed a site that could not load :bug:
We fixed a bug that affected a site which could not be extracted due to a JavaScript issue.


##20 August 2016


[bug fix] CSV and JSON failing to load :bug:
We fixed a bug that caused some large jobs to fail: even though the job succeeded, the CSV and JSON files failed to show as available for download. All fixed now!


##19 August 2016


[bug fix] Simple chaining failing to start :bug:
We fixed an issue that prevented some Simple Chaining jobs from starting.


##18 August 2016


[bug fix] Correct number of queries and renewal date :bug:
We fixed a bug that caused the renewal date and the number of queries left in a plan to show up incorrectly.


##8 August 2016


[improvement] Better extractor names :sunny:
New extractors now have better default names; for example, we remove the initial “www.”.


[new feature] Ability to download URLs :new:
When you need to copy all the URLs in the URL box to some other tool, we now provide the ability to download them as a simple text file. So even if you generate tens of thousands of URLs with the URL generator, you’ll get them all with one click.


[improvement] Removing link to dashboard in app :sunny:
For those of you using the app, we have removed the link to the dashboard which was confusing.


[improvement] Better search for extractor names :sunny:
We now provide full-text search on all extractor names, so typing “ik” will match “www.ikea.com/…”.
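
The matching behaviour described here is essentially a case-insensitive substring search. A minimal sketch, assuming that interpretation (the function name `search_extractors` is hypothetical and this is not the product's search code):

```python
def search_extractors(names: list[str], query: str) -> list[str]:
    """Return the names that contain the query anywhere, ignoring case."""
    needle = query.casefold()
    return [name for name in names if needle in name.casefold()]
```

Given the names `www.ikea.com/…` and `example.org/list`, searching for `ik` returns only the first.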


##28 July 2016


[bug fix] Latest CSV API now handles files larger than 6MB :bug:
Resolved a bug where files larger than 6MB were producing the error [body size is too long] via the API. The fix may require some small programmatic adjustments, depending on how you are calling the API. See this article for details.


##21 July 2016


[improved navigation] Link back to Dashboard from My Data :sunny:
For users who were with us before April 2016 and have the Desktop app, there is now a link to switch back to the Web Extractor Dashboard in the top right-hand corner of your My Data page. See here.

NB: The Web Extractor Dashboard does not work in the desktop app browser.


##12 July 2016


[new feature] Simple Chaining :new:
We have released a new way to feed an extractor: point to an existing extractor and choose one of its columns containing URLs. This makes it easier to “chain” extractors, where the output of one extractor provides the list of URLs for the next one.

Learn to use Simple Chaining with this tutorial.
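
Conceptually, chaining takes one column of the first extractor's output and uses it as the URL list for the second. The sketch below illustrates that idea on plain Python dictionaries; the function name, the row format, and the choice to skip empty and duplicate URLs are assumptions for the example, not a description of how Simple Chaining is implemented.

```python
def urls_for_next_extractor(parent_rows: list[dict], url_column: str) -> list[str]:
    """Collect the URLs in one column of the parent extractor's rows, in order."""
    urls = []
    seen = set()
    for row in parent_rows:
        url = (row.get(url_column) or "").strip()
        # Assumption: empty cells and repeated URLs are skipped.
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

The resulting list would then be used as the input URLs of the second extractor.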


##11 July 2016


[new feature] Shortcuts for regular expressions :new:
Regular expressions are very powerful, but not everyone is comfortable writing them. So we provide a number of shortcuts for common types like number, currency, URL… Try it and feel the power!
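
To give a feel for what a shortcut saves you from writing, here is a sketch with illustrative patterns for the three types mentioned. These patterns and the names `SHORTCUTS` and `extract` are assumptions for the example; they are not the expressions the product uses behind its shortcuts.

```python
import re
from typing import Optional

# Illustrative patterns only; real-world data may need stricter or looser rules.
SHORTCUTS = {
    "number": r"-?\d+(?:\.\d+)?",
    "currency": r"[$€£]\s?\d+(?:[.,]\d{2})?",
    "url": r"https?://[^\s\"'<>]+",
}


def extract(shortcut: str, text: str) -> Optional[str]:
    """Return the first match for the named shortcut, or None if nothing matches."""
    match = re.search(SHORTCUTS[shortcut], text)
    return match.group(0) if match else None
```

For instance, `extract("number", "Rated 4.5 out of 5")` returns `4.5` and `extract("currency", "Price: $19.99")` returns `$19.99`.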


[new feature] CSV from last run :new:
In the Integrate tab for an extractor, it is now possible to get a URL that will download the CSV for the last successful run of this extractor.


[new feature] Google Sheets integration :new:
In the same location, one can also obtain a formula that can be pasted into Google Sheets to fetch the data from the last successful run of the extractor.


##7 July 2016


[improvement] New URL Generator :sunny:
We are proud to introduce a new URL Generator. It makes it possible to specify any part of a URL as a parameter. Rumor has it that it is now possible to generate hundreds of thousands of URLs without the dashboard becoming unresponsive.



[improvement] More space for the extractor name :sunny:
Feel free to give your extractors meaningful names!


[new feature] Scheduling icon :new:
Scheduled extractors now display a small “clock” icon. Mouse over to see when the next run is scheduled.


##29 June 2016


[improvement] Crawl infrastructure update :sunny:
Many stability improvements including tweaks to memory usage, log rotation, improved deployment behaviour and autoscaling.


##22 June 2016


[new feature] Extracting “alt text” for an image :new:
We have added a feature that automatically extracts the “Alt text” for an image. This is useful when you want to get hold of data that is displayed in a visual format e.g. star ratings on Trip Advisor. The “alt text” is made available as an additional field / column in the JSON / CSV.
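
If you want to see what “alt text” looks like in a page's HTML, the sketch below pulls it out with Python's standard `html.parser`. It only illustrates the concept; the class and function names are hypothetical and this is not how the Web Extractor obtains the value.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collect the non-empty `alt` attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.alt_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.alt_texts.append(alt)


def extract_alt_texts(html: str) -> list[str]:
    collector = AltTextCollector()
    collector.feed(html)
    return collector.alt_texts
```

For a snippet such as `<img src="stars.png" alt="4.5 of 5 stars">`, this returns `["4.5 of 5 stars"]`, which is the kind of value that ends up in the additional column.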


[bug fix] CSV download issues :bug:
Fixed a number of bugs that were causing issues with the CSV output (including a missing Source URL column and row sorting issues).


##17 June 2016


[bug fix] CSV data was mixed up, now fixed :bug:
In some cases when a column had only a few pictures, the data could get misaligned among columns. Fixed.


##9 June 2016


[new feature] Duplicate extractors :new:
Extractors can now be copied. This makes it easy to copy and modify one of your existing Extractors, or even copy someone else’s Extractor into your own Dashboard.


[new feature] RSS feed :new:
RSS feed of your latest Extractor runs is now available via the Integrate tab inside Dashboard. RSS is useful, for example, for Zapier integration.


[new feature] HTML output support :new:
Advanced Web Extractor option: output raw HTML of the selected elements.


[new feature] Default column value support :new:
Advanced Web Extractor option: set a default value to use instead of outputting a blank text string when a column value is empty.


[new feature] Required column value support :new:
Advanced Web Extractor option: exclude a row from your dataset if a particular column value is empty.
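
The default value and required column options can be pictured as a small post-processing step over extracted rows. The sketch below is a hypothetical illustration: the function name, the dictionary row format, and the choice to check required columns before applying defaults are all assumptions, not documented product behaviour.

```python
def apply_column_rules(
    rows: list[dict], defaults: dict, required: list[str]
) -> list[dict]:
    """Drop rows missing a required value, then fill empty cells with defaults."""
    cleaned = []
    for row in rows:
        # Required column: exclude the whole row if the value is empty.
        if any(not row.get(column) for column in required):
            continue
        # Default value: replace an empty cell when a default is configured.
        cleaned.append(
            {
                column: value if value else defaults.get(column, value)
                for column, value in row.items()
            }
        )
    return cleaned
```

With `defaults={"price": "N/A"}` and `required=["name"]`, a row with an empty `price` keeps its place with `N/A` filled in, while a row with an empty `name` is excluded.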


[improvement] Blocking overlay removal
Web Extractor now allows you to select elements previously blocked by website overlays.


[improvement] Better Run history limit messaging
More informative messaging to clearly indicate the free plan limit on the number of items available inside Run history.


[bug fix] Save button disabled for empty dataset
Fixed an issue where the Web Extractor Save button was not disabled when a dataset is empty.

##8 June 2016


[improvement] Signup process
We improved the signup screens and added the ability to specify a country code for phone numbers.


[bug fix] Resetting password
We fixed an issue that prevented some users from resetting their password.


##7 June 2016


[improvement] Improved handling of groups
We have improved the handling of multiple groups in the extraction response.


[improvement] Improved handling of error cases for logs
Improved handling of error cases for logs with bad URLs; all errors are now reported.


[bug fix] Correct row count
We fixed a bug where newline characters would cause an incorrect row count.
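
This class of bug typically appears when rows are counted by counting lines, because a quoted CSV field may itself contain a newline. The sketch below shows the difference using Python's standard `csv` module; it is an illustration of the general pitfall, not the code that was fixed, and `count_rows` is a hypothetical name.

```python
import csv
import io


def count_rows(csv_text: str) -> int:
    """Count CSV records, treating newlines inside quoted fields as data."""
    # newline="" lets the csv module handle line endings itself.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return sum(1 for _ in reader)
```

For the text `'name,description\r\n"Widget","Line one\nLine two"\r\n'`, a naive `splitlines()` count reports 3 lines, while `count_rows` reports 2 records (the header plus one data row).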


##6 June 2016

[improvement] Major update to our crawling infrastructure
Codenamed “Bees”, it uses a swarm of machines to handle your jobs a lot quicker, and will retry automatically.


[new feature] Results now include source URL and are sorted :new:
Nicer data: we now include the source URL as a separate column in the CSV file, making it easier to collate and compare different datasets. The results are also sorted in the same order as the URLs in the original list.
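
If you need to reproduce this ordering on data from another source, one simple approach is to sort rows by the position of their source URL in the original list. This is a sketch under assumptions: the column name `url` and the function name are hypothetical, and rows whose URL is not in the list are placed at the end.

```python
def sort_by_source_order(
    rows: list[dict], source_urls: list[str], url_column: str = "url"
) -> list[dict]:
    """Order rows to match the order of their source URL in the input list."""
    position = {}
    for index, url in enumerate(source_urls):
        position.setdefault(url, index)  # keep the first occurrence of a URL
    # sorted() is stable, so rows from the same URL keep their relative order.
    return sorted(rows, key=lambda row: position.get(row[url_column], len(position)))
```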


[improvement] Better Web extractor
The Web extractor is now more efficient, and more resilient to small changes in a page.


[update] Improved information capture during sign up process
We have added the ability to capture more information about users upon signup so that we can better personalize the user experience.


##1 June 2016

[new feature] CSS Toggle :new:
Added the ability to disable CSS on the page to increase options to extract data. Useful when the data you need is not visible on the page. Tutorial page available.

[improvement] URL Generator list instant refresh
The generated URL list now refreshes on the fly as users edit any of the parameter inputs inside the URL Generator. This provides instant feedback on what the final URL list is going to contain. Previously, users would have to click off an input before they could see the updated list.


##31 May 2016

[improvement] Better website rendering
Various improvements made to the Web Extractor ensuring that websites are displayed as accurately as possible. These include:

  • better support for lazy-loading images
  • strategy to ensure complete loading of website content
  • improved rendering speed
  • better error handling and logging

##27 May 2016

[bug fix] Incomplete list of Extractors
Fixed an issue where previously only the latest 20 Extractors were displayed in the Dashboard.

##26 May 2016

[improvement] Quick tutorial: URL Generator
Link to a brief tutorial page on how to use the URL Generator now available via the Dashboard.

[improvement] Improved navigation within the app
E.g. “import.io” logos now link to the appropriate home locations, depending on whether the user is logged in or logged out.

##19 May 2016

[new feature] URL Generator :new:
Generate all the URLs you want to run the Web Extractor on with just one click. Based on the initial URL you created the Web Extractor with, “URL Generator” lets you easily tweak the parameters to create a list of URLs to run. This is particularly useful for defining page ranges, category lists, and also many other search scenarios.

[improvement] Web Extractor UI improvements
E.g., “Done” button now available in both website and data views.

[bug fix] Column navigation when there are no overflowing columns in the website view
Fixed an issue where redundant column navigation was shown when there are no overflowing columns in the Web Extractor website view.

[bug fix] Incorrect plan cancellation data confirmation message
Fixed an issue with the plan cancellation data confirmation message inside the Dashboard.


##17 May 2016

[improvement] Better ad-blocking and analytics tracking-blocking support
Quicker extraction and better Web Extractor experience for end-users. Also, site-owners can be confident that Import.io traffic will have no impact on their analytics.

[improvement] Better JavaScript rendering
Various JS-related improvements made to the Web Extractor ensuring that websites are displayed as accurately as possible, so that users can always select the right data.

##06 May 2016

[improvement] Clearing URL input upon successful addition
URL input is automatically cleared once a URL has been successfully added within the “Add or manage URLs” modal when a user creates or edits an Extractor.

[improvement] Better Extractor UI loading messages
More relevant Extractor UI loading messages are now shown, e.g. when a user edits an existing Extractor.

[bug fix] Error pasting list of URLs from spreadsheet
Fixed issues where URLs copied from a spreadsheet (Google Sheets, Excel, etc.) were corrupted within the URLs box inside the Dashboard.

[bug fix] Manual XPath not picking up attribute values
Fixed an issue involving Manual XPath not picking up attribute values (e.g. /@href) in an Extractor.

[bug fix] Browser back button not working properly inside Dashboard
Fixed issues where using the browser back button left the UI in a slightly broken state inside the Dashboard.

[bug fix] Broken warning when trying to upgrade to an existing plan
Fixed a minor issue with a broken warning modal when a user attempts to upgrade to a plan they are already subscribed to.