Step 1: Download
Click on the link below to download the Java version of the API as a JAR file.
NOTE: This API code is compatible with Java 6+. You can download Java 6 from here: http://java.com/en/download/ie_manual.jsp?locale=en&host=java.com
Step 2: Create Your First Job
Follow the steps below to set up your first crawl job through the API. Sample code for doing this can be found in the Sample Code section.
1. Download the 80legs API JAR and place it on your application's classpath.
2. Create an instance of APIProfile and set the following parameters:
VersionId - Set to "1.0".
ApiToken - Set this to the value you obtained from the web portal after following the instructions in the Getting Started section.
Example:
APIProfile profile = new APIProfile();
profile.setVersion("1.0");
profile.setApiToken("[Your API Security Token]");
3. Create an instance of JavaEightyLegsConnector, passing the APIProfile as the constructor parameter.
Example:
IEightyLegsConnector connector = new JavaEightyLegsConnector(profile);
4. Use the JavaEightyLegsConnector methods to create jobs, retrieve jobs, and call the other methods listed in the API Methods section.
Complete sample code for doing the above can be found in the Sample Code section.
Javadocs
The Javadocs can be found at https://portal.80legs.com/api/apidocs/javaapidoc/.
You can start by looking at the docs for the class at com.eightylegs.customer.api.JavaEightyLegsConnector.
Details on how to add Javadocs for a JAR file can be found at http://www.vogella.de/articles/Eclipse/article.html#classpath_jarjavadoc.
API Methods
The following methods are available from the API:
| Category | Method | Response | Description |
| --- | --- | --- | --- |
| Job Methods | createJob | int JobID | Creates a job in 80legs. Deprecated: public int createJob(JobSetting job), replaced by public int createJob(JobSetting job, boolean repeatForever). |
| | deleteJob | N/A | Deletes the job identified by the given job ID. |
| | cancelJob | N/A | Cancels the job identified by the given job ID. |
| | copyJob | int JobID | Copies the existing job identified by the given job ID. The new job is created with the specified job name. |
| | retrieveJobs | ArrayList<JobSummary> | Retrieves all jobs with the given status. |
| | retrieveJobSetting | JobSetting | Retrieves the job settings that were used to create the job. |
| | retrieveJobOverview | JobOverview | Retrieves the overview information for the job, including the job status and the latest job queue status. |
| | retrieveJobRuns | ArrayList<JobRun> | Retrieves information for all runs of the given job, including the results of those runs. |
| | retrieveRunResultInfo | ArrayList<RunResult> | Retrieves the run result information for the job run identified by the given ID, including fields such as the file name and file type. New overload: retrieves the run result information for all job runs that have the given status. |
| | downloadResult | String Filename | Downloads the result file and saves it at the specified path. New overload in version 1.6 with one added parameter (int crawlPackageId): downloads the job result files for jobs that exist in a crawl package. This method does not post-process the results. |
| | downloadPostProcessedResults (new) | String Filename | Downloads the job file and, if it is a code analysis result file, reads it and post-processes the result. Requires the eightyAppId, which is available in the JobSetting object. New overload in version 1.6 with two added parameters (int postProcessingCodeId, int packagedCrawlId): downloads the job result files for jobs that exist in a crawl package. If a file is a code analysis result file, it is read and post-processed. The default post-processing code is used unless postProcessingCodeId is set to one of the post-processing codes allowed for the 80app used in the crawl package, in which case that code is used. |
| Code Methods | uploadCode | int CodeID | Uploads the code in the given file under the name used to identify it. New overload: public int uploadCode(File file, String name, int maxNodeHeapSpaceMB) also lets the user specify the maximum node heap size the code needs to run. |
| | retrieveCodeByUser | ArrayList<CodeFile> | Retrieves all code information for the given user. |
| | retrieveCodeByID | CodeFile | Retrieves the code information identified by the code ID. |
| | downloadCode | String Filename | Downloads the code identified by the code ID to the location given by the file path and saves it under the given file name. |
| | deleteCode | N/A | Deletes the code identified by the given ID. |
| Data Methods | uploadData | int DataID | Uploads the data in the given file and uses the data name to identify it. |
| | retrieveDataByUser | ArrayList<DataFile> | Retrieves the information for all data files belonging to the user. |
| | retrieveDataByID | DataFile | Retrieves the information for the data identified by the ID. |
| | downloadData | String Filename | Downloads the data identified by the given ID. |
| | deleteData | N/A | Deletes the data file identified by the given ID. |
| Seed List Methods | uploadSeedList | int SeedlistID | Uploads the seed list file under the name used to identify it. Deprecated: public int uploadSeedList(File file, String seedListName, boolean ignoreBadURLs, String ignoreBadUrlMessage) (reason: a String passed as a parameter cannot carry the message back to the caller) and public int uploadSeedList(File file, String seedListName, boolean ignoreBadURLs) (reason: does not return the bad URL message). Suggested replacement for both: uploadSeedList(File file, String seedListName, boolean ignoreBadURLs, StringBuilder ignoreBadUrlMessage). This new overload takes an additional parameter for validation messages. If ignoreBadURLs is set to true and any URLs are bad, the seed list is still added and the message describing the bad URLs is returned in the ignoreBadUrlMessage StringBuilder. |
| | retrieveSeedListByUser | ArrayList<SeedlistFile> | Retrieves the information for all seed list files belonging to the user. |
| | retrieveSeedListByID | SeedlistFile | Retrieves the seed list file information identified by the given ID. |
| | downloadSeedList | String Filename | Downloads the seed list identified by the seed list ID. |
| | deleteSeedList | N/A | Deletes the seed list identified by the ID. |
| Account Methods | retrieveAccountBalance | AccountBalance | Retrieves the user account balance information. |
| | retrieveUserInformation | User | Retrieves the user information. |
| EightyApp (80app) Methods (New) | retrieveEightyAppById(int) | EightyApp | Retrieves the EightyApp information identified by the given 80app ID. Provides the latest public version of the 80app. |
| | retrieveEightyAppByVersionId | EightyApp | Retrieves the EightyApp information identified by the given versionId. The EightyApp object has the version that is associated with the versionId. |
| | downloadResultFile | String | Downloads the result file at the given path and saves it using the job result name. The job can be a crawl package job, an archived job, or the user's job. NOTE: Can be used for Crawl Package jobs or Crawl Package Archived Jobs. |
| Crawl Package (New) | retrieveJobRunsForCrawlPackage | List<JobRun> | Retrieves all job run information for the given job ID and crawl package ID. |
| | retrieveJobsForCrawlPackage | List<JobSummary> | Retrieves all jobs with the given status. If the status is null, all jobs for the user are retrieved. |
| | retrieveAvailableCrawlPackagesByUser | List<CrawlPackage> | Retrieves the Crawl Packages available to the user. |
| Crawl Package Archived Jobs Data | retrieveCrawlPackageArchivedJobs | List<JobSummary> | Retrieves all archived jobs with the given status. If the status is null, all archived jobs for the user are retrieved. |
| | retrieveJobOverviewForCrawlPackage | JobOverview | Retrieves the overview information for the archived job in a crawl package, including the job status and how many pages were analyzed and crawled. |
| | retrieveJobRunsForCrawlPackageArchivedJob | List<JobRun> | Retrieves all job run information for the given archived job ID. |
| | retrieveRunResultsInfoForAllNewResultsForCrawlPackageArchivedJobs | List<RunResult> | Retrieves all the new results for the user. |
Check out the Sample Code for samples on how to use the API.
Check out the FAQ for answers to a wide variety of questions.
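The uploadSeedList deprecation note in the method table rests on a general Java behavior: a method cannot hand text back to its caller through a String parameter, because reassigning the parameter only rebinds the method's local reference, while a StringBuilder can be appended to in place. The sketch below illustrates that difference only. SeedListMessageDemo, validateWithString, and validateWithBuilder are hypothetical names that are not part of the 80legs API, and the "bad URL" check is a deliberately crude placeholder.

```java
import java.util.Arrays;
import java.util.List;

public class SeedListMessageDemo {

    // Hypothetical stand-in: tries to report bad URLs through a String parameter.
    // Reassigning 'message' only changes the local variable, so the caller sees nothing.
    public static int validateWithString(List<String> urls, String message) {
        int bad = 0;
        for (String url : urls) {
            if (!url.startsWith("http")) { // placeholder check, not the real validation
                message = message + url + ";";
                bad++;
            }
        }
        return bad;
    }

    // Hypothetical stand-in: appends to the caller's StringBuilder in place,
    // so the caller can read the message after the call returns.
    public static int validateWithBuilder(List<String> urls, StringBuilder message) {
        int bad = 0;
        for (String url : urls) {
            if (!url.startsWith("http")) { // placeholder check, not the real validation
                message.append(url).append(';');
                bad++;
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://example.com", "not-a-url");

        String text = "";
        validateWithString(urls, text);
        System.out.println("String after call: [" + text + "]");

        StringBuilder builder = new StringBuilder();
        validateWithBuilder(urls, builder);
        System.out.println("StringBuilder after call: [" + builder + "]");
    }
}
```

With the real API, the same idea would apply to the StringBuilder passed as ignoreBadUrlMessage: create it before the call and read it afterwards.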
API Changelog
80legs periodically updates the API to deliver new features and to repair defects discovered in previous versions. In most cases, these changes will be transparent to API developers. Occasionally, however, we need to make changes that require developers to modify their existing applications. This page documents any changes to the API that may affect your application. We recommend that API developers check this list periodically for new announcements.
| Version | Release Date | Description |
| --- | --- | --- |
| 1.0.1 | 7/29/2009 | Initial release. |
| 1.0.2 | 3/12/2010 | New feature: (3/11) Added a parent exception class called EightyLegsCommonException, which can be used to catch all API exceptions. Bug fixes: (2/25) Fixed an uploadSeedlist problem that occurred when a validation message was being returned. (3/12) Fixed a problem that prevented the creation of repeating jobs. |
| 1.0.3 | 3/18/2010 | New feature: (3/1) Added the ability to create a job that has an analysis method of EIGHTY_APP. Bug fix: (3/15) Fixed a problem where the data ID was not sent when the job used 80app as its analysis method. |
| 1.0.4 | 4/9/2010 | New feature: (4/8) Added the method public JobOverview retrieveJobOverview(String jobName). |
| 1.0.5 | 6/8/2010 | New feature: (6/8) Added an overloaded uploadCode method, public int uploadCode(File file, String name, int maxNodeHeapSpaceMB), which also lets the user specify the maximum node heap size the code needs to run. |
| 1.0.6 | 7/10/2010 | Bug fixes: EightyLegsParameterException and EightyLegsXmlException now inherit from EightyLegsCommonException. The API now uses the java.util.List interface for its parameters. New feature: Added the maxPagesPerDomain crawl parameter for jobs. Note: Only certain SLA plans can set this parameter. |
| 1.0.7 | 7/22/2010 | New features: Added methods that give access to crawl package jobs through the API. Post-processed result files now have the same name as the .80 result file. |
| 1.0.8 | 8/9/2010 | Bug fix: Fixed a problem in job settings where the job frequency interval was shown incorrectly in the portal. This problem also affected API users. |
| 1.0.9 | 8/19/2010 | New features: Added the OutgoingLinkType option LINKS_FROM_SAME_FULLY_QUALIFIED_DOMAIN_WITH_RESTRICTED_HOST. Added a job run field called isDoneResultPosting. If it is set to 1, no more results will be available for the job run. |
| 1.0.10 | 1/24/2011 | New feature: Added the ability to retrieve archived jobs using the API. |
| 1.0.15 | 5/25/2011 | Moved to Java's built-in HTTP URL connection instead of sockets. Also resolved a problem where our server did not send a header with the response for upload seed list and upload code. |
| 1.0.16 | 8/11/2011 | Bug fix: Added a fix for the upload data "invalid HTTP method" error. |
| 1.0.17 | 9/13/2011 | Bug fix: The Job Summary now returns the latest JobRunStatusType when retrieveJobs(JobStatusType status) is called. |
| 1.0.18 | 5/15/2012 | Bug fix: The eightyAppVersionId is now returned with each RunResult. |
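Changelog entries 1.0.2 and 1.0.6 describe EightyLegsCommonException as the parent of EightyLegsParameterException and EightyLegsXmlException, so a single catch clause can cover all API exceptions. The sketch below shows that pattern with locally declared stand-in classes so that it compiles without the 80legs JAR. The constructors, the choice of checked exceptions, and the riskyCall and describe helpers are assumptions made for illustration; the Javadocs remain the reference for the real class definitions.

```java
public class CatchAllApiExceptionsDemo {

    // Stand-ins that mirror only the inheritance described in the changelog.
    static class EightyLegsCommonException extends Exception {
        EightyLegsCommonException(String message) {
            super(message);
        }
    }

    static class EightyLegsParameterException extends EightyLegsCommonException {
        EightyLegsParameterException(String message) {
            super(message);
        }
    }

    static class EightyLegsXmlException extends EightyLegsCommonException {
        EightyLegsXmlException(String message) {
            super(message);
        }
    }

    // Hypothetical helper that always fails, in one of two ways.
    static void riskyCall(boolean badParameter) throws EightyLegsCommonException {
        if (badParameter) {
            throw new EightyLegsParameterException("bad parameter");
        }
        throw new EightyLegsXmlException("bad XML");
    }

    // One catch clause on the parent type handles both subclasses.
    public static String describe(boolean badParameter) {
        try {
            riskyCall(badParameter);
            return "ok";
        } catch (EightyLegsCommonException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(true));  // EightyLegsParameterException: bad parameter
        System.out.println(describe(false)); // EightyLegsXmlException: bad XML
    }
}
```

Catching the parent type keeps call sites short; code that needs to react differently to parameter and XML problems can still add more specific catch clauses before it.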
Last Updated
The API was last updated on May 15, 2012.