< BID User's Guide >

7. BID Advanced Configuration

Click on the "Advanced Configuration" button on the BID Configuration tab to access the BID Advanced Configuration.

Images



BID Advanced Configuration - Images

Non thumbnailed image types

Controls which types of image are considered to be "embedded" (see Image Type Filters). Uncheck the image types that you'd like BID to ignore when processing embedded images.

Delete images smaller than

Enable this option to have BID automatically delete small image files after downloading. This is useful to remove banner adverts and other unwanted small image page elements.

Do not download images larger than

Enable this option to prevent BID from downloading any images that are larger than this specified size. This is only possible if the image host makes the image size available to BID when it begins downloading.

Save images with original server date/time if available

BID will set the downloaded image file's "last modified" date and time to the date and time provided by the image host (if available). This can be useful if you need to sort downloaded images by date after downloading.
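As a rough illustration of the idea (this is not BID's own code), the Python sketch below applies the value of an HTTP "Last-Modified" header to a saved file. The helper name apply_server_time and the sample header value are assumptions made for this example only.

```python
import os
from email.utils import parsedate_to_datetime


def apply_server_time(path, last_modified_header):
    """Set a file's "last modified" time from an HTTP Last-Modified value.

    Returns True if the time was applied, or False if the header was
    missing or unparseable (the file then keeps its download time).
    """
    if not last_modified_header:
        return False
    try:
        server_time = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return False
    timestamp = server_time.timestamp()
    # Set both the access time and the modified time to the server's value.
    os.utime(path, (timestamp, timestamp))
    return True
```

Calling apply_server_time("photo.jpg", "Wed, 21 Oct 2015 07:28:00 GMT") would stamp the file with that date, so sorting the download folder by date follows the image host's dates rather than the download time.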

Save images using image title as file name

If enabled, BID will use the thumbnail image's "title" or "alt" attribute as the file name when saving the full sized image.


General



BID Advanced Configuration - General

Cache Control

BID caches files automatically when downloading - this allows for broken downloads to be resumed from the point of failure and has other performance benefits. The cache can be configured to be cleared on exit or kept for a specified number of days. Click on the "Clear Cache and Temp Files Now" button to delete all cached files immediately.

Prompt to save batch on exit

If enabled, BID will pop up a warning dialog on closing if there are still files to be downloaded.

Cancel download if free disk space below

BID will automatically cancel a download if the free disk space drops below the selected value.

Save gallery URL shortcut in download folder

If enabled, BID will create an internet shortcut to the gallery page in the download folder.

Disable Unicode characters in file and folder names

If enabled, BID will replace any Unicode (non ASCII) character with an underscore.
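The effect of this option can be pictured with a short Python sketch. The function name ascii_only is hypothetical, and BID's exact handling of individual characters may differ from what is shown here.

```python
def ascii_only(name):
    """Replace every non-ASCII character in name with an underscore."""
    return "".join(ch if ord(ch) < 128 else "_" for ch in name)


# "\u00e9" is the single character "e with acute accent".
print(ascii_only("caf\u00e9_photo.jpg"))  # prints caf__photo.jpg
```

Names that are already plain ASCII pass through unchanged.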

Disable "Close when download complete" when downloading selected images

If enabled, BID will automatically uncheck the "close when download complete" checkbox when the "Download selected" download option is used.

Generated file names numeric padding

Controls the zero padding size of the generated numbers used when the "Generate filenames" option is being used.
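To show what the padding size changes, here is a small Python sketch. The function generated_name and the ".jpg" extension are assumptions for illustration; the file names BID actually generates may contain other parts.

```python
def generated_name(index, padding, extension=".jpg"):
    """Build a file name from a number, zero padded to the given width."""
    return f"{index:0{padding}d}{extension}"


print(generated_name(7, 4))    # prints 0007.jpg
print(generated_name(123, 4))  # prints 0123.jpg
```

With enough padding, an alphabetical sort of the file names matches their numeric order. Without it, "10.jpg" sorts before "2.jpg".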


Advanced Configuration Lists and Regular Expressions


When scanning web pages BID uses a special internal "scoring" system to determine which links are thumbnailed images and which are full sized images. All other link types are ignored.

This works very well for the majority of web galleries, but occasionally you may come across a gallery that BID cannot download from, or where incorrect links are identified as thumbnailed image links.

By configuring the "Ignore List", "Include List", "Redirect Links" and "JavaScript Sites" settings discussed next, you may be able to successfully download such problem galleries.

Note that some web sites go to extreme lengths to block the use of automated image downloaders such as BID. Because of this it may not always be possible to configure BID to download from such sites. However, if you come across such a problem site please contact us and send us the details. If there's enough demand we can investigate supporting the site as a "special case" in a future release of BID.


BID Advanced Configuration - Ignore List

Regular Expressions

The Ignore List, Include List and Redirect Links are lists of "regular expressions", one per line.

A regular expression is a specially formatted text string used for pattern matching.

The following characters have special meanings when used in regular expressions and must be prefixed by a backslash (\) if you wish to use them as literals in your regular expression.

[, \, ^, $, ., |, ?, *, +, (, ), /

^  = start of line
$  = end of line
.  = match any character
\x = use literal character x

For a detailed explanation of regular expressions please visit http://wikipedia.org/wiki/Regular_expression

Some examples:

The regular expression test matches any link containing the word "test" such as
"http://website.com/test123/index.htm" or "http://example.com/images/test_image.jpg"

The regular expression _th\.jpeg$ matches any link that ends with "_th.jpeg", such as "http://mysite.com/thumbs/car_th.jpeg" or "http://anothersite.com/thumbpic_th.jpeg".

Notice the use of the backslash to indicate a literal dot, and the dollar sign indicating the end of the line.
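If you would like to try the two examples above outside of BID, the Python sketch below tests them with Python's "re" module. We are assuming here that these simple patterns behave the same way in Python as they do in BID; the helper name link_matches is made up for this example.

```python
import re


def link_matches(pattern, link):
    """Return True if the regular expression matches anywhere in the link."""
    return re.search(pattern, link) is not None


# "test" matches any link that contains the word.
print(link_matches(r"test", "http://website.com/test123/index.htm"))  # True

# "_th\.jpeg$" only matches links that end with "_th.jpeg".
print(link_matches(r"_th\.jpeg$", "http://mysite.com/thumbs/car_th.jpeg"))  # True
print(link_matches(r"_th\.jpeg$", "http://mysite.com/car_th.jpeg?size=2"))  # False

# Without the backslash the dot matches any character, not just ".".
print(link_matches(r"_th.jpeg$", "http://mysite.com/car_thxjpeg"))  # True
```

The last line shows why the backslash matters: an unescaped dot also accepts links you probably did not intend to match.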

Simplified Regular Expressions

As most people find regular expressions hard to understand we've added support for what we call "simplified" regular expressions. These are of the form:
<must contain this text> or <!must not contain this text>

The characters within the angle brackets are treated as literals - no backslashes are required to "escape" any characters.

e.g. If we want to match any URL that contains the string "/index.php?id=" we would use:
</index.php?id=>

These can be combined together as often as required.

e.g. We want to match all links that contain both "/index.php?id=" and "gal=1" but do not contain "/advert":
</index.php?id=><gal=1><!/advert>

A normal regular expression can be combined with these simplified expressions if required.

e.g. We want to match all links that contain both "/index.php?id=" and "gal=1", do not contain "/advert", and end with ".php":
</index.php?id=><gal=1><!/advert>\.php$
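One way to picture how a simplified expression is evaluated is the Python sketch below. It is our own illustration under stated assumptions, not BID's actual parser: the function matches_simplified is hypothetical, and it assumes the angle bracket parts always come first, with any normal regular expression following them.

```python
import re

# One <text> or <!text> part; the text itself is treated as a literal.
PART = re.compile(r"<(!?)([^<>]+)>")


def matches_simplified(expression, link):
    """Evaluate leading <...> / <!...> parts, then an optional regex tail."""
    pos = 0
    while True:
        part = PART.match(expression, pos)
        if part is None:
            break
        negated = part.group(1) == "!"
        text = part.group(2)
        # <text> requires the text to be present, <!text> requires it absent.
        if (text in link) == negated:
            return False
        pos = part.end()
    tail = expression[pos:]
    return tail == "" or re.search(tail, link) is not None


expr = r"</index.php?id=><gal=1><!/advert>\.php$"
print(matches_simplified(expr, "http://a.com/index.php?id=5&gal=1&to=view.php"))  # True
print(matches_simplified(expr, "http://a.com/advert/index.php?id=5&gal=1&to=view.php"))  # False
```

Every part must be satisfied for the link to match, which is why the second link is rejected as soon as "/advert" is found in it.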


Ignore List


The Ignore list is used by BID to filter out links that may be misidentified as thumbnailed image links, or full sized images. After downloading a page, BID will automatically try to match all the links found against this list of regular expressions. Any matching links are discarded.

Entries in the "include list" can override the "ignore list" and cause items to be included when they should be ignored. To force BID to apply your "ignore" rule, prefix it with two asterisks (**).

This can be useful if you find that BID keeps incorrectly identifying a particular type of link (such as a banner advert) as an image, or if BID keeps downloading the wrong full sized image from a web page. Simply add a regular expression that matches the unwanted links and BID will ignore it.
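The interaction between the two lists can be sketched in Python as follows. This is only our reading of the rules described above, not BID's internal logic: the function is_ignored is hypothetical, and the sketch assumes plain regular expressions (no simplified expressions, and no asterisk prefixes in the include list).

```python
import re


def is_ignored(link, ignore_list, include_list):
    """Decide whether a link is discarded, under the assumptions above."""
    forced = [p[2:] for p in ignore_list if p.startswith("**")]
    normal = [p for p in ignore_list if not p.startswith("**")]
    # A "**" ignore rule always wins, even over the include list.
    if any(re.search(p, link) for p in forced):
        return True
    # Otherwise an include list match overrides the normal ignore rules.
    if any(re.search(p, link) for p in include_list):
        return False
    return any(re.search(p, link) for p in normal)


link = "http://example.com/pics/banner1.jpg"
print(is_ignored(link, ["banner"], ["/pics/"]))    # False: include list wins
print(is_ignored(link, ["**banner"], ["/pics/"]))  # True: forced ignore rule
```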


Include List


If BID finds no images when scanning a web gallery page you can force it to detect the images by adding regular expressions that match the full sized image links.

Suppose all the full sized image links look like this:
http://trickysite.com/pics/large/<random number>/

To force BID to detect these as valid full sized image links you would add the following regular expression to the "Include list":
<trickysite.com/pics/large/>

This would match any link that contains "trickysite.com/pics/large/".

Some full sized image pages only display "medium sized" images and have links to pages containing larger or original sized images. To force BID to follow such links, add a regular expression that matches the link to the include list, prefixed with an asterisk (*). This also works for cases where BID does not find the correct full sized image on the page.

For example, suppose our full sized image page contains a link to the original sized image that looks like:
http://example.com/pics/fullsized/(name).jpg

We would add the following line to the include list:
*<example.com/pics/fullsized/>

Some web sites use non standard methods to generate thumbnails. To help BID find these thumbnails on a web page, add a matching regular expression prefixed by two asterisks (**). For example:
**<thumbnailgenerator.php?id=>


Redirect Links


Many web galleries link the thumbnailed images to redirection "services" that display pop up adverts before redirecting to the full sized image page. BID automatically tries to resolve these redirected links when it finds them so that it can download directly from the original image page.

A nice by product of this redirection resolution is that you can then use the "Export Gallery" function to export a clean, redirection free gallery that can be pasted into forums or web pages.

If you come across a gallery that is redirected through a redirection service that BID is unaware of, simply add a matching regular expression to this list.

For example, suppose all gallery thumbnail links look like this:
http://newultracashimagebucks.org/<random number>

To force BID to resolve these links you would add the line
<newultracashimagebucks.org>
to the list.


JavaScript Sites


Some web sites generate their content dynamically using JavaScript. This may prevent BID from detecting any images when scanning gallery pages of such sites. To force BID to process the JavaScript for such sites add regular expressions that match the web site domain to this list. Note that doing this can greatly slow down page loading and processing.

If the expression matches a full sized image page, downloading can be sped up by prefixing the expression with a single asterisk (*).

Some web sites display galleries using an "endless scrolling" method. There are no "next" and "prior" page links, and thumbnailed images are loaded dynamically as you scroll down the page. To make BID "auto scroll" such pages, prefix the expression with two asterisks (**).


Multi Page Galleries


If BID is failing to load multiple pages from a gallery, you can add a regular expression to help BID find the "next page" link. In place of the actual page number use the sub expression (\d+), e.g.
website\.com/galleries/(.*)/page(\d+)\.php$

If the web site starts at page 0 instead of 1, prefix the expression with a single asterisk (*).
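To check a "next page" expression before adding it, you can test it against a few links in Python, as in the sketch below. The gallery-name group is written as (.*), which is the form Python's "re" module accepts. The function page_number and the sample links are assumptions for illustration, and we are assuming BID treats this pattern the same way Python does.

```python
import re

PAGE_PATTERN = r"website\.com/galleries/(.*)/page(\d+)\.php$"


def page_number(link, pattern=PAGE_PATTERN):
    """Return the page number captured by (\\d+), or None if no match."""
    found = re.search(pattern, link)
    if found is None:
        return None
    # Group 1 is the gallery name, group 2 is the page number.
    return int(found.group(2))


print(page_number("http://website.com/galleries/cars/page12.php"))  # 12
print(page_number("http://website.com/galleries/cars/index.php"))   # None
```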


Folders



BID Advanced Configuration - Folders

Standard BID folders can be changed here if required. The "Queue Folder" is used by BID to save a batch file before it is queued via the "Add to Queue" function. The "Retry Batch Folder" is where BID automatically saves batches containing images that failed to download. These files are listed in the Queue Manager and can be re-queued if required. The "Log File Folder" is where BID keeps its log file.


Delays



BID Advanced Configuration - Delays

Some web sites may block you from downloading temporarily if you download too much data in a short amount of time. To work around this you can add expressions that match such web sites to the "Delay" list. Whenever BID downloads from a site that's on this list it will limit the number of download threads to 1, and pause for the specified number of seconds after each download.
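The behaviour described above amounts to downloading one file at a time with a pause after each. The Python sketch below shows that idea only; it is not BID's downloader. The names download_with_delay, fetch and sleep are hypothetical, and the caller supplies the function that actually fetches a link.

```python
import re
import time


def download_with_delay(links, delay_patterns, delay_seconds, fetch,
                        sleep=time.sleep):
    """Fetch links one at a time, pausing after links on the delay list."""
    results = []
    # A plain loop stands in for "limit download threads to 1".
    for link in links:
        results.append(fetch(link))
        if any(re.search(p, link) for p in delay_patterns):
            sleep(delay_seconds)
    return results
```

Passing sleep as a parameter is a design choice for this sketch: it lets you try the function with a stand-in that records the pauses instead of really waiting.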

