80legs API - Python Version

Page history last edited by Aliya 1 year, 3 months ago


Step 1:  Download


Click on the link below to download the Python version (v1.2.2) of the API as a zip file.

 

Download Now

 

Note: This API code is compatible with Python 2.5 or later and JPype-0.5.4. 

 

JPype library: JPype is an effort to allow Python programs full access to Java class libraries.
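The compatibility note above can be checked before running any API code. The following is a minimal sketch, not part of the 80legs download: the helper name check_environment is hypothetical, and it only confirms that the interpreter meets the stated minimum and that the jpype module can be imported. It does not verify that the installed JPype is version 0.5.4.

```python
import sys


def check_environment(min_python=(2, 5)):
    """Return a list of problems found; an empty list means none were detected.

    Hypothetical helper for illustration; it is not shipped with the 80legs API.
    """
    problems = []

    # Compare only (major, minor) against the minimum stated in the note above.
    if tuple(sys.version_info[:2]) < tuple(min_python):
        problems.append(
            "Python %d.%d or later is required, found %d.%d"
            % (min_python[0], min_python[1],
               sys.version_info[0], sys.version_info[1])
        )

    # JPype is imported under the module name "jpype".
    try:
        import jpype  # noqa: F401 (imported only to confirm availability)
    except ImportError:
        problems.append("JPype could not be imported; install it before using the API")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Running the file directly prints one line per detected problem and prints nothing when both checks pass.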

 

Step 2:  Create Your First Job


Follow the steps below to set up your first crawl job through the API.  Sample code for doing this can be found in the Sample Code section.

 

     1.     Download the 80legs Python API and place it in your application's class path.

     2.     Set the following parameters:

VersionId - Set to "1.0".

ApiToken - Set this to the value you obtained from the web portal after following the instructions in the Getting Started section.

 

Example:

                              auth_token = '[Your API Token]'

                              version = '1.0'

 

     3.     Create an instance of PythonEightyLegsConnector, passing the version (set to "1.0") and your API token as parameters, in that order.

Example:

connector = PythonEightyLegsConnector(version, auth_token)
 

     4.     Use the methods of PythonEightyLegsConnector to create jobs, retrieve jobs, and call the other methods described in the next section.

 

Complete sample code for doing the above can be found in the Sample Code section. 
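One way to avoid hard-coding the API token in steps 2 and 3 is to read it from the environment. The sketch below rests on assumptions: the helper build_connector and the environment variable name EIGHTYLEGS_API_TOKEN are hypothetical, and StubConnector is a stand-in used only so the snippet runs on its own. In a real application you would pass the PythonEightyLegsConnector class from the downloaded API instead of the stub.

```python
import os


def build_connector(connector_cls, environ=None, version="1.0"):
    """Create a connector using a token read from the environment.

    connector_cls is the class to instantiate, for example the
    PythonEightyLegsConnector class from the downloaded API.
    EIGHTYLEGS_API_TOKEN is a hypothetical variable name chosen for this sketch.
    """
    if environ is None:
        environ = os.environ
    auth_token = environ.get("EIGHTYLEGS_API_TOKEN")
    if not auth_token:
        raise ValueError("EIGHTYLEGS_API_TOKEN is not set")
    # Argument order (version, auth_token) follows the example in step 3.
    return connector_cls(version, auth_token)


class StubConnector(object):
    """Stand-in for demonstration only; not part of the 80legs API."""

    def __init__(self, version, auth_token):
        self.version = version
        self.auth_token = auth_token


# Demonstration with the stub and an explicit environment mapping.
connector = build_connector(
    StubConnector, environ={"EIGHTYLEGS_API_TOKEN": "example-token"}
)
print(connector.version)
```

Passing the class as a parameter keeps the helper independent of how the API module is imported, which this page does not specify.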

 

Epydocs


The docs can be found at:  Epydocs 

 

Sample Code


Look at the examples folder in the downloaded API project.

Usage:

  •     See SampleJobGenerator.py for an example of how to create JobSettings for different types of jobs.
  •     See exampleCreateJob.py for example usage of the create job method.
  •     See exampleDownloadPostProcessedJobResultforJobMethods.py for an example of how to download and post-process the result file for a job.
  •     See exampleMethods.py for example usage of all the other available methods.
  •     See exampleUserMethods.py for examples of how to get user account and user information.
  •     See exampleCrawlPackageMethods.py for examples of how to get job information for a crawl package.

 

FAQ


Check out the FAQ for answers to a wide variety of questions.

 

API Changelog


80legs periodically updates the API to deliver new features and to repair defects discovered in previous versions. In most cases, these changes will be transparent to API developers. Occasionally, however, we need to make changes that require developers to modify their existing applications. This page documents any changes made to the API that may affect your application. We recommend that API developers check this list periodically for new announcements.

 

Version    Release Date    Description

80legs API

1.1.0    7/29/2009    Initial Release

1.2.0    8/19/2010

New Feature

  • Added access to eighty app, eighty app version and crawl packages
  • Added a new value to the OutgoingLinkType enumeration: LINKS_FROM_SAME_FULLY_QUALIFIED_DOMAIN_WITH_RESTRICTED_HOST. This value crawls links from the same fully qualified domain as each URL in the seed list and treats "www.domain.com" and "domain.com" as different domains. More Details
  • Added a new data attribute, isDonePostingResult, to Job Runs. It may take up to 10 minutes for all of the results to be posted after a job completes, and previously there was no way to determine whether posting had finished. If this field is set to 1, all the results for the completed job have been posted. More Details

1.2.1    10/26/2010

Bug Fix

  • Added interface method for getJobRunForCrawlPackage
  • Fixed a bug in the getJobRun() method

1.2.2    11/2/2010

Bug Fix

  • Fixed a bug related to frequencyType. To print this value, use jobSummary.frequencyType.name

New Feature

  • Added dateEnded for JobRuns

 

Resources


  1. Python documentation conventions are discussed in Guido van Rossum's Python Style Guide: http://www.python.org/doc/essays/styleguide.html

 

Last Updated


The API was last updated on November 2nd, 2010.

 

 
