Step 1: Download
Click on the link below to download the Python version (v1.2.2) of the API as a zip file.
Note: This API code is compatible with Python 2.5 or later and JPype-0.5.4.
JPype library: JPype is an effort to allow Python programs full access to Java class libraries.
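Before going further, it can help to confirm that your interpreter meets the requirements in the note above. The following is a minimal sketch and is not part of the 80legs download: the helper name check_prerequisites is hypothetical, and it assumes JPype is importable under the module name jpype.

```python
import sys


def check_prerequisites(min_python=(2, 5)):
    """Report whether this interpreter meets the stated requirements.

    Hypothetical helper for illustration; it is not part of the 80legs API.
    """
    # Compare only (major, minor) against the minimum version.
    python_ok = sys.version_info[:2] >= min_python
    try:
        import jpype  # noqa: F401  (assumed module name for JPype)
        jpype_available = True
    except ImportError:
        jpype_available = False
    return {"python_ok": python_ok, "jpype_available": jpype_available}


if __name__ == "__main__":
    status = check_prerequisites()
    print("Python version OK: %s" % status["python_ok"])
    print("JPype importable: %s" % status["jpype_available"])
```

The check only tells you whether JPype can be imported; it does not verify that the installed JPype is version 0.5.4.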
Step 2: Create Your First Job
Follow the steps below to set up your first crawl job through the API. Sample code for doing this can be found in the Sample Code section.
1. Download the 80legs Python API and place it in your application's class path.
2. Set the following parameters:
VersionId - Set to "1.0".
ApiToken - Set this to the value you obtained from the web portal after following the instructions in the Getting Started section.
Example:
auth_token = '[Your API Token]'
version = '1.0'
3. Create an instance of PythonEightyLegsConnector, passing the version (set to "1.0") and your API token as parameters, in that order.
Example:
connector = PythonEightyLegsConnector(version, auth_token)
4. Use the PythonEightyLegsConnector methods to create jobs, retrieve jobs, and call the other methods specified in the next section.
Complete sample code for doing the above can be found in the Sample Code section.
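The steps above can be sketched as a small helper. This is a sketch under stated assumptions, not code from the 80legs download: build_connector is a hypothetical name, and the connector class is passed in as an argument because its import path depends on where you placed the downloaded API. The only detail taken from this page is the argument order (version first, then API token).

```python
def build_connector(connector_cls, auth_token, version="1.0"):
    """Instantiate a connector class with (version, auth_token).

    connector_cls is expected to be PythonEightyLegsConnector from the
    downloaded API; build_connector itself is a hypothetical helper.
    """
    # Reject an empty token or the unedited placeholder from the example.
    if not auth_token or auth_token == "[Your API Token]":
        raise ValueError("Set auth_token to the API token from the web portal")
    # Argument order follows the example above: version first, then token.
    return connector_cls(version, auth_token)
```

With the API importable, the call would look like connector = build_connector(PythonEightyLegsConnector, auth_token). Catching the placeholder token early gives a clearer error than a failed request later.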
Epydocs
The docs can be found at: Epydocs
Sample Code
Look at the examples folder in the downloaded API project.
Usage:
- See SampleJobGenerator.py for an example of how to create JobSettings for different types of jobs.
- See exampleCreateJob.py for example usage of the create job method.
- See exampleDownloadPostProcessedJobResultforJobMethods.py for an example of how to download and post-process the result file for a job.
- See exampleMethods.py for example usage of all the other available methods.
- See exampleUserMethods.py for an example of how to get user account and user information.
- See exampleCrawlPackageMethods.py for examples of how to get job information for a crawl package.
Check out the FAQ for answers to a wide variety of questions.
API Changelog
80legs periodically updates the API to deliver new features and to repair defects discovered in previous versions. In most cases, these changes will be transparent to API developers. Occasionally, however, we need to make changes that require developers to modify their existing applications. This page documents any changes made to the API that may affect your application. We recommend that API developers periodically check this list for new announcements.
| Version | Release Date | Description |
|---|---|---|
| 1.1.0 | 7/29/2009 | Initial Release |
| 1.2.0 | 8/19/2010 | New Feature: (1) Added access to eighty app, eighty app version, and crawl packages. (2) Added a new value to the OutgoingLinkType enumeration, LINKS_FROM_SAME_FULLY_QUALIFIED_DOMAIN_WITH_RESTRICTED_HOST. This value crawls links from the same fully qualified domain for each URL in the seed list and treats "www.domain.com" and "domain.com" as different domains. (3) Added a new data attribute, isDonePostingResult, to Job Runs. It may take up to 10 minutes for all of the results to be posted, and previously there was no way to determine whether all the results had been posted after a job completed. If this field is set to 1, all the results for the completed job have been posted. |
| 1.2.1 | 10/26/2010 | Bug Fix: (1) Added interface method for getJobRunForCrawlPackage. (2) Fixed a bug in the getJobRun() method. |
| 1.2.2 | 11/2/2010 | Bug Fix: Fixed a bug related to frequency type; to print this value, use jobSummary.frequencyType.name. New Feature: Added dateEnded for JobRuns. |
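The isDonePostingResult attribute listed under version 1.2.0 lends itself to a simple polling loop. The following is a minimal sketch, assuming you supply a zero-argument callable that returns the current job run object; the names wait_for_posted_results and fetch_job_run are hypothetical and not part of the 80legs API. The default timeout of 600 seconds mirrors the "up to 10 minutes" noted in the changelog.

```python
import time


def wait_for_posted_results(fetch_job_run, timeout=600, interval=30,
                            sleep=time.sleep, clock=time.time):
    """Poll until the job run reports isDonePostingResult == 1.

    fetch_job_run is a caller-supplied, zero-argument callable (an
    assumption of this sketch) returning an object that exposes an
    isDonePostingResult attribute.
    Returns True once results are posted, False if the timeout elapses.
    """
    deadline = clock() + timeout
    while True:
        job_run = fetch_job_run()
        # A missing attribute is treated as "not done yet".
        if getattr(job_run, "isDonePostingResult", 0) == 1:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

In practice fetch_job_run would wrap whichever connector call returns the job run, such as getJobRun(); check the Epydocs for that method's exact arguments. The sleep and clock parameters exist only so the loop can be exercised without real waiting.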
Resources
- Python documentation conventions are discussed in Guido van Rossum's Python Style Guide: http://www.python.org/doc/essays/styleguide.html
Last Updated
The API was last updated on November 2nd, 2010.