Jobs
The following error codes may occur when your job is running.
J000
There was an unknown error with your job. Please submit a ticket and tell us your job ID as well as your best description of what you were trying to do. If you were trying to run a job in the sandbox environment, try running it in the live environment instead.
J100
80legs encountered run-time errors with your job. Please submit a ticket if you feel you set up your code correctly.
J101
Your job crawled more than 10,000 pages, and on more than 1% of them your code failed to load or there was a constructor or initialize error.
If you're using an 80app provided by 80legs, it's possible this error was caused by a temporary problem with our system. In this case, we recommend trying to run your job again in a few hours.
J102
Your job crawled more than 10,000 pages, and parseLinks() threw a security exception on more than 1% of them.
J103
Your job crawled more than 10,000 pages, and parseLinks() threw a general exception on more than 25% of them.
J104
Your job crawled more than 10,000 pages, and on more than 1% of them your constructor and/or initialize functions timed out and 80legs successfully stopped the threads.
J105
Your job crawled more than 10,000 pages, and on more than 10% of them parseLinks() and/or processDocument() timed out and 80legs successfully stopped the threads.
J201
Your job analyzed more than 10,000 pages, and processDocument() threw a security exception on more than 1% of them.
J202
Your job analyzed more than 10,000 pages, and processDocument() threw a general exception on more than 25% of them.
J300
While your job was running, your constructor and/or initialize functions timed out and forced 80legs to stop the JVM more than 10 times.
J400
While your job was running, parseLinks() and/or processDocument() timed out and forced 80legs to stop the JVM more than 25 times.
J500
80legs could not find the code you wanted to run. This is most likely a problem within 80legs. Please submit a ticket if you feel you set up your code correctly.
J501
80legs could not find the seed list you wanted to use with your job.
J600
You chose to run keyword or regular expression matching and either didn't enter the expression data or provided an invalid expression.
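The percentage-based codes above (J101 through J105, J201, and J202) all follow one pattern: a minimum of more than 10,000 pages, plus a failure rate above a per-code threshold. The sketch below shows one way to read that rule. It is an illustration only: the class and method names are hypothetical, and how 80legs actually computes these rates is not documented here.

```java
// Hypothetical helper, not part of 80legs. It mirrors the wording of the
// J1xx/J2xx descriptions: the rule applies only above 10,000 pages, and the
// failure percentage must be strictly greater than the threshold.
final class JobErrorThresholds {
    static final long MIN_PAGES = 10_000;

    static boolean exceeds(long pages, long failures, double thresholdPercent) {
        if (pages <= MIN_PAGES) {
            return false; // rule only applies to jobs with more than 10,000 pages
        }
        double failurePercent = failures * 100.0 / pages;
        return failurePercent > thresholdPercent;
    }
}
```

For example, under this reading a job with 20,000 crawled pages and 201 parseLinks() security exceptions would be over the 1% threshold described for J102, while the same job with exactly 200 would not.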
Code Approval
The following error codes may occur when your code is being run through the approval process.
C000
This is a general error that should be encountered rarely. Please submit a ticket if you get this error.
C101
Your JAR was signed. Custom code should not be signed.
C102
You must use the latest version code when writing custom code. From time to time, we may make changes to the WebAnalysis class. A new version code will be provided with each change to the class, and custom code must use the newest version code.
C103
Your code took too long to complete. Code must finish running within 10 seconds.
C104
The JAR file was not found.
C200
initialize() was called with a general error. This error is a catch-all for any generic errors that occur with initialize().
C201
initialize() was called with IllegalArgumentException. Your code has illegal arguments for initialize().
C202
initialize() was called with NoSuchMethodError. Your code doesn't contain an implementation for initialize().
C300
parseLinks() was called with a general error. This error is a catch-all for any generic errors that occur with parseLinks().
C301
parseLinks() was called with IllegalArgumentException. Your code has illegal arguments for parseLinks().
C302
parseLinks() was called with NoSuchMethodError. Your code doesn't contain an implementation for parseLinks().
C400
processDocument() was called with a general error. This error is a catch-all for any generic errors that occur with processDocument().
C401
processDocument() was called with IllegalArgumentException. Your code has illegal arguments for processDocument().
C402
processDocument() was called with NoSuchMethodError. Your code doesn't contain an implementation for processDocument().
C500
getVersion() was called with a general error. This error is a catch-all for any generic errors that occur with getVersion().
C501
getVersion() was called with IllegalArgumentException. Your code has illegal arguments for getVersion().
C502
getVersion() was called with NoSuchMethodError. Your code doesn't contain an implementation for getVersion().
C600
Your code cannot contain another JAR in it.
C700
The data used during your code approval could not be read.
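Since C103 is triggered when code does not finish within 10 seconds, it may help to time your code locally before submitting it. The following is a minimal sketch using only the standard java.util.concurrent classes. The class and method names are hypothetical, and it does not reproduce how 80legs enforces the limit during approval.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical local check, not an 80legs API.
final class TimeLimitCheck {
    // Returns true if the task completes within timeoutMillis, false if it
    // times out or the waiting thread is interrupted.
    static <T> boolean finishesWithin(Callable<T> task, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(task);
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the task was still running when the limit expired
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        } catch (ExecutionException e) {
            // The task finished in time but threw; surface that separately.
            throw new IllegalStateException("Task failed", e.getCause());
        } finally {
            executor.shutdownNow(); // interrupts the worker if it is still running
        }
    }
}
```

A call such as `TimeLimitCheck.finishesWithin(() -> runMyCode(), 10_000)` would check a hypothetical `runMyCode()` against the 10-second approval limit. The same helper could be pointed at the 30-second combined limit for parseLinks() and processDocument() described under COMPUTE_TIMEOUT_GOOD.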
Crawl Status
DNS_ERROR
DNS could not resolve the host name for this URL.
EXCEEDS_MAX_PAGE_SIZE
The page you were trying to crawl contains more data than your current subscription plan allows you to download per page. You can upgrade your plan or contact us to discuss available options.
HTTPS_SKIP
Our crawler can crawl most HTTPS-encrypted pages, but on rare occasions it cannot.
INVALID_URL
This URL is not formatted correctly and was not crawled.
MIME_TYPE_SKIP
The MIME type for this URL is not included in the MIME types you chose to crawl during your job.
NO_RESPONSE
This error typically means that the web server did not give any response to our request for the page (80legs tries multiple times to fetch each page). The remote server may not be functioning correctly at that moment.
ROBOTS.TXT_ERROR
Our crawler obeys the robots.txt specification and will not crawl pages that are blocked by a domain's robots.txt directives.
Other Codes
You may receive a three-digit numeric code in your crawl status. These codes correspond to the standard HTTP response status codes.
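When post-processing results, it can be useful to separate the named crawl statuses listed above from numeric HTTP status codes. The sketch below is one possible way to do that. The class, enum, and method names are hypothetical, and it assumes the crawl status arrives as a plain string, which this page does not specify.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical classifier, not part of 80legs.
final class CrawlStatusKind {
    enum Kind { NAMED, HTTP, UNKNOWN }

    // The named crawl statuses documented on this page.
    private static final Set<String> NAMED_CODES = new HashSet<>(Arrays.asList(
        "DNS_ERROR",
        "EXCEEDS_MAX_PAGE_SIZE",
        "HTTPS_SKIP",
        "INVALID_URL",
        "MIME_TYPE_SKIP",
        "NO_RESPONSE",
        "ROBOTS.TXT_ERROR"));

    static Kind classify(String status) {
        if (status == null) {
            return Kind.UNKNOWN;
        }
        String trimmed = status.trim();
        if (NAMED_CODES.contains(trimmed)) {
            return Kind.NAMED;
        }
        // Three digits starting with 1-5, the range of standard HTTP status classes.
        if (trimmed.matches("[1-5]\\d{2}")) {
            return Kind.HTTP;
        }
        return Kind.UNKNOWN;
    }
}
```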
Process Status
The following codes are shown in your Crawled URLs results file for each page that was crawled.
GOOD
The page was analyzed with no problem.
NO_PROCESS
The page was not analyzed due to the analysis regular expression or a crawling error. If you received a robots.txt error, it is likely that the site you tried to crawl prohibits crawling this URL according to its robots.txt file.
NO_PROCESS_MIME_OR_ANALYSIS_REGEX
The page was not analyzed due to the analysis MIME type or analysis regex. The page was crawled successfully and was used in parseLinks().
PROCESS_TRUNCATED
The page was analyzed with no problem, but the results were over the limit. The results were truncated to 1024 bytes.
PROCESS_EMPTY_PAGE
The page was analyzed successfully, but the document contents were empty (we allow processing empty documents because users might pull interesting information from the headers, status, or just the fact that it was empty).
PROCESS_RETURN_NULL
Your processDocument() method returned null for the page listed. A return value of null from processDocument() signals 80legs to not include that URL in the analyzed results.
PARSE_SECURITY_EXCEPTION
The parseLinks() method threw a SecurityException. This is usually because the parseLinks() method attempted to do something that is not allowed, such as accessing the disk or making network requests.
PARSE_EXCEPTION
The parseLinks() method threw a general Exception.
PROCESS_SECURITY_EXCEPTION
The processDocument() method threw a SecurityException. This is usually because the processDocument() method attempted to do something that is not allowed, such as accessing the disk or making network requests.
PROCESS_EXCEPTION
The processDocument() method threw a general Exception.
PROCESS_INTERNAL_ERROR
This means that an 80legs internal error occurred when processing your document. Hopefully this never happens. :)
COMPUTE_TIMEOUT_GOOD
Your parseLinks() and processDocument() combined took longer than 30 seconds to finish, and 80legs was able to stop your code.
COMPUTE_TIMEOUT_BAD
Your parseLinks() and processDocument() combined took longer than 30 seconds to finish, and 80legs was unable to stop your code, so it had to stop the JVM.
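To avoid PROCESS_TRUNCATED, you could check the size of a result before returning it from your own code. The sketch below assumes that the limit equals the 1024 bytes mentioned in that entry and that a text result is measured in UTF-8 bytes. Neither assumption is confirmed on this page, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical size check, not part of 80legs.
final class ResultSizeCheck {
    // Assumed limit, taken from the PROCESS_TRUNCATED description.
    static final int MAX_RESULT_BYTES = 1024;

    static boolean fitsLimit(byte[] result) {
        return result.length <= MAX_RESULT_BYTES;
    }

    // Measures text in UTF-8 bytes; non-ASCII characters take more than one byte.
    static boolean fitsLimit(String result) {
        return fitsLimit(result.getBytes(StandardCharsets.UTF_8));
    }
}
```

Measuring bytes rather than characters matters here because a string shorter than 1024 characters can still exceed 1024 bytes once encoded.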