Looters (instalooter.looters)
Instagram looter implementations.
-
class instalooter.looters.HashtagLooter(hashtag, **kwargs)
Bases: instalooter.looters.InstaLooter
A looter targeting medias tagged with a hashtag.
Create a new hashtag looter.
- Parameters
hashtag (str) – the hashtag to search for.
See InstaLooter.__init__ for more details about accepted keyword arguments.
-
download(destination, condition=None, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all medias passing condition to destination.
- Parameters
destination (FS or str) – the filesystem where to store the downloaded files, as a filesystem instance or FS URL.
condition (function) – the condition to filter the medias with. If None is given, a function is created using the get_videos and videos_only values passed at object initialisation.
media_count (int or None) – the maximum number of medias to download. Leave as None to download everything from the target. Note that more files can be downloaded, since a post with multiple images/videos is considered to be a single media.
timeframe (tuple or None) – a tuple of two datetime objects to enforce a time frame (the first item must be more recent). Leave as None to ignore times.
new_only (bool) – stop media discovery when already downloaded medias are encountered.
pgpbar_cls (type or None) – an optional ProgressBar subclass to use to display page scraping progress.
dlpbar_cls (type or None) – an optional ProgressBar subclass to use to display file download progress.
- Returns
the number of queued medias. May not be equal to the number of downloaded medias if some errors occurred during background download.
- Return type
int
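As a rough illustration of the condition parameter, the sketch below defines a predicate that could be handed to download. It assumes each media is a dictionary exposing a boolean is_video key; that key name and the sample dictionaries are assumptions made for the example, not something stated on this page.

```python
def only_videos(media):
    """Return True for medias flagged as videos.

    Assumption: each media is a dict with an ``is_video`` key.
    """
    return bool(media.get("is_video", False))


# Stand-in media dictionaries, used only to exercise the predicate.
sample_medias = [
    {"id": "1", "is_video": True},
    {"id": "2", "is_video": False},
    {"id": "3"},
]
selected = [m["id"] for m in sample_medias if only_videos(m)]
```

With a real looter, the predicate would be passed as looter.download(destination, condition=only_videos).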
-
download_pictures(destination, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all the pictures to the provided destination.
Actually a shortcut for download with condition set to accept only images.
-
download_videos(destination, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all videos to the provided destination.
Actually a shortcut for download with condition set to accept only videos.
-
get_post_info(code)
Get media information from a given post code.
-
logged_in()
Check if there’s an open Instagram session.
-
login(username, password)
Log the instance in using the given credentials.
-
logout()
Log the instance out from the currently opened session.
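One possible way to combine logged_in, login and logout is sketched below. The helper run_logged_in and the FakeLooter stand-in are hypothetical names introduced for this example; only the three method names come from this page, and the choice to close only a session the helper opened is a design assumption.

```python
def run_logged_in(looter, username, password, action):
    """Run ``action(looter)`` inside a logged-in session.

    ``looter`` is any object exposing the ``logged_in``, ``login`` and
    ``logout`` methods documented above.
    """
    opened_here = False
    if not looter.logged_in():
        looter.login(username, password)
        opened_here = True
    try:
        return action(looter)
    finally:
        # Only close the session this helper opened itself.
        if opened_here:
            looter.logout()


class FakeLooter:
    """Stand-in used to exercise the helper without network access."""

    def __init__(self):
        self.session_open = False
        self.calls = []

    def logged_in(self):
        return self.session_open

    def login(self, username, password):
        self.calls.append("login")
        self.session_open = True

    def logout(self):
        self.calls.append("logout")
        self.session_open = False


fake = FakeLooter()
result = run_logged_in(fake, "user", "secret", lambda l: l.logged_in())
```

A real looter instance could be passed in place of FakeLooter, with action performing the download.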
-
medias(timeframe=None)
Obtain an iterator over the Instagram medias.
Wraps the iterator returned by InstaLooter.pages to seamlessly iterate over the medias of all the pages.
- Returns
an iterator over the medias in every page.
- Return type
-
class instalooter.looters.InstaLooter(add_metadata=False, get_videos=False, videos_only=False, jobs=16, template='{id}', dump_json=False, dump_only=False, extended_dump=False, session=None)
Bases: object
A brutal Instagram looter that raids without API tokens.
Create a new looter instance.
- Parameters
add_metadata (bool) – Add date and comment metadata to the downloaded pictures.
get_videos (bool) – Also get the videos from the given target.
videos_only (bool) – Only download videos (implies get_videos=True).
jobs (int) – the number of parallel threads to use to download media (12 or more is advised to have a true parallel download of media files).
template (str) – a filename format, in Python new-style-formatting format. See the Template page of the documentation for available keys.
dump_json (bool) – Save each resource metadata to a JSON file next to the actual image/video.
dump_only (bool) – Only save metadata and discard the actual resource.
extended_dump (bool) – Attempt to fetch as much metadata as possible, at the cost of more time. Set to True if, for instance, you always want the top comments to be downloaded in the dump.
session (Session or None) – a requests session, or None to create a new one.
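To make the template parameter more concrete, here is a minimal sketch of how a new-style format string expands. Only the {id} key is confirmed by the default value shown above; the username key, the sample values and the way the extension is appended are assumptions for illustration, and the Template page remains the reference for the keys actually available.

```python
# Default template from the signature above.
default_template = "{id}"

# A custom template; the "username" key is assumed for illustration.
custom_template = "{username}-{id}"

# Hypothetical media metadata used to fill the templates.
media = {"id": "1234567890", "username": "example_user"}

# str.format ignores unused keyword arguments, so the same dictionary
# works for both templates.
default_name = default_template.format(**media) + ".jpg"
custom_name = custom_template.format(**media) + ".jpg"
```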
-
download(destination, condition=None, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all medias passing condition to destination.
- Parameters
destination (FS or str) – the filesystem where to store the downloaded files, as a filesystem instance or FS URL.
condition (function) – the condition to filter the medias with. If None is given, a function is created using the get_videos and videos_only values passed at object initialisation.
media_count (int or None) – the maximum number of medias to download. Leave as None to download everything from the target. Note that more files can be downloaded, since a post with multiple images/videos is considered to be a single media.
timeframe (tuple or None) – a tuple of two datetime objects to enforce a time frame (the first item must be more recent). Leave as None to ignore times.
new_only (bool) – stop media discovery when already downloaded medias are encountered.
pgpbar_cls (type or None) – an optional ProgressBar subclass to use to display page scraping progress.
dlpbar_cls (type or None) – an optional ProgressBar subclass to use to display file download progress.
- Returns
the number of queued medias. May not be equal to the number of downloaded medias if some errors occurred during background download.
- Return type
int
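Because the timeframe tuple must list the more recent bound first, a small helper can avoid ordering mistakes. The function make_timeframe below is a hypothetical convenience, not part of the library, and the sketch uses datetime.datetime values on the assumption that this matches the "datetime objects" wording above.

```python
import datetime


def make_timeframe(first, second):
    """Return a (most_recent, oldest) tuple from two datetimes in any order."""
    return (max(first, second), min(first, second))


timeframe = make_timeframe(
    datetime.datetime(2018, 1, 1),
    datetime.datetime(2018, 6, 30),
)
```

The resulting tuple could then be passed as looter.download(destination, timeframe=timeframe).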
-
download_pictures(destination, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all the pictures to the provided destination.
Actually a shortcut for download with condition set to accept only images.
-
download_videos(destination, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all videos to the provided destination.
Actually a shortcut for download with condition set to accept only videos.
-
medias(timeframe=None)
Obtain an iterator over the Instagram medias.
Wraps the iterator returned by InstaLooter.pages to seamlessly iterate over the medias of all the pages.
- Returns
an iterator over the medias in every page.
- Return type
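The idea of wrapping a page iterator so that medias come out as one flat stream can be sketched with itertools. This is only an illustration of the flattening pattern: it assumes each page is a plain iterable of media dictionaries, which is a simplification and may not match the structure of the pages the library actually returns.

```python
import itertools


def iter_medias(pages):
    """Flatten an iterable of pages into a single iterator of medias."""
    return itertools.chain.from_iterable(pages)


# Hypothetical pages, each holding a few media dictionaries.
pages = [
    [{"id": "a"}, {"id": "b"}],
    [{"id": "c"}],
]
ids = [media["id"] for media in iter_medias(pages)]
```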
-
class instalooter.looters.PostLooter(code, **kwargs)
Bases: instalooter.looters.InstaLooter
A looter targeting a specific post.
Create a new post looter.
- Parameters
code (str) – the code of the post to get.
See InstaLooter.__init__ for more details about accepted keyword arguments.
-
download(destination, condition=None, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download the referred post to the destination.
See InstaLooter.download for argument reference.
Note
This function, as opposed to other looter implementations, will not spawn new threads, but will simply use the main thread to download the files.
Since a worker is in charge of downloading one media at a time (and not one file), there would be no point in spawning more.
-
download_pictures(destination, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all the pictures to the provided destination.
Actually a shortcut for download with condition set to accept only images.
-
download_videos(destination, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all videos to the provided destination.
Actually a shortcut for download with condition set to accept only videos.
-
get_post_info(code)
Get media information from a given post code.
-
logged_in()
Check if there’s an open Instagram session.
-
login(username, password)
Log the instance in using the given credentials.
-
logout()
Log the instance out from the currently opened session.
-
medias(timeframe=None)
Return a generator that yields only the referred post.
- Yields
dict – a media dictionary obtained from the given post.
- Raises
StopIteration – if the post does not fit the timeframe.
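The behaviour described above, a generator that yields a single post unless it falls outside the timeframe, could look roughly like the sketch below. The function single_media and the date key on the media dictionary are hypothetical; the sketch ends the generator with a bare return, which is what a caller of next() observes as StopIteration.

```python
import datetime


def single_media(media, timeframe=None):
    """Yield ``media`` once, unless it falls outside ``timeframe``.

    Assumption: ``media["date"]`` holds a datetime for the post.
    """
    if timeframe is not None:
        most_recent, oldest = timeframe
        if not (oldest <= media["date"] <= most_recent):
            return  # ends the generator without yielding anything
    yield media


post = {"id": "abc", "date": datetime.datetime(2018, 3, 1)}
inside = (datetime.datetime(2018, 12, 31), datetime.datetime(2018, 1, 1))
outside = (datetime.datetime(2017, 12, 31), datetime.datetime(2017, 1, 1))

kept = list(single_media(post, inside))
dropped = list(single_media(post, outside))
```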
-
class instalooter.looters.ProfileLooter(username, **kwargs)
Bases: instalooter.looters.InstaLooter
A looter targeting medias on a user profile.
Create a new profile looter.
- Parameters
username (str) – the username of the profile.
See InstaLooter.__init__ for more details about accepted keyword arguments.
-
download(destination, condition=None, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all medias passing condition to destination.
- Parameters
destination (FS or str) – the filesystem where to store the downloaded files, as a filesystem instance or FS URL.
condition (function) – the condition to filter the medias with. If None is given, a function is created using the get_videos and videos_only values passed at object initialisation.
media_count (int or None) – the maximum number of medias to download. Leave as None to download everything from the target. Note that more files can be downloaded, since a post with multiple images/videos is considered to be a single media.
timeframe (tuple or None) – a tuple of two datetime objects to enforce a time frame (the first item must be more recent). Leave as None to ignore times.
new_only (bool) – stop media discovery when already downloaded medias are encountered.
pgpbar_cls (type or None) – an optional ProgressBar subclass to use to display page scraping progress.
dlpbar_cls (type or None) – an optional ProgressBar subclass to use to display file download progress.
- Returns
the number of queued medias. May not be equal to the number of downloaded medias if some errors occurred during background download.
- Return type
int
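The note on media_count, that more files than medias can be downloaded, comes down to simple counting. The numbers below are made up purely to illustrate it: each entry stands for one media and gives the number of files it contains.

```python
# Hypothetical profile: number of files contained in each media (post).
files_per_media = [1, 3, 1, 2]

# Limit the download to the first three medias.
media_count = 3
queued = files_per_media[:media_count]

queued_medias = len(queued)   # what the media_count limit applies to
downloaded_files = sum(queued)  # files that would end up on disk
```

Here three medias are queued, yet five files would be written, because the second media holds three files.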
-
download_pictures(destination, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all the pictures to the provided destination.
Actually a shortcut for download with condition set to accept only images.
-
download_videos(destination, media_count=None, timeframe=None, new_only=False, pgpbar_cls=None, dlpbar_cls=None)
Download all videos to the provided destination.
Actually a shortcut for download with condition set to accept only videos.
-
get_post_info(code)
Get media information from a given post code.
-
logged_in()
Check if there’s an open Instagram session.
-
login(username, password)
Log the instance in using the given credentials.
-
logout()
Log the instance out from the currently opened session.
-
medias(timeframe=None)
Obtain an iterator over the Instagram medias.
Wraps the iterator returned by InstaLooter.pages to seamlessly iterate over the medias of all the pages.
- Returns
an iterator over the medias in every page.
- Return type
-
pages()
Obtain an iterator over Instagram post pages.
- Returns
an iterator over the Instagram post pages.
- Return type
- Raises
ValueError – when the requested user does not exist.
RuntimeError – when the user is a private account and there is no logged user (or the logged user does not follow that account).
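The two documented exceptions can be handled separately by the caller. The sketch below is one way to do so; describe_profile and FakePages are hypothetical names used for illustration, and since this page does not say whether the errors surface when pages() is called or when iteration starts, both steps sit inside the try block.

```python
def describe_profile(looter):
    """Fetch the first page and report the documented failure modes."""
    try:
        first_page = next(iter(looter.pages()), None)
    except ValueError:
        return "no such user"
    except RuntimeError:
        return "private profile (login required)"
    return "empty profile" if first_page is None else "ok"


class FakePages:
    """Stand-in looter whose pages() either raises or returns fixed pages."""

    def __init__(self, error=None, pages=()):
        self._error = error
        self._pages = list(pages)

    def pages(self):
        if self._error is not None:
            raise self._error
        return iter(self._pages)


status_ok = describe_profile(FakePages(pages=[{"page": 1}]))
status_missing = describe_profile(FakePages(error=ValueError("unknown user")))
status_private = describe_profile(FakePages(error=RuntimeError("private")))
status_empty = describe_profile(FakePages())
```

A ProfileLooter instance could be passed in place of FakePages.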