Filter by date doesn't seem to be working for me #2246

Open
bialyrycerz opened this issue May 7, 2024 · 6 comments
Labels
question Question

Comments

@bialyrycerz

I am using the construct instaloader --post-filter="date_utc <= datetime(2018, 5, 31)" target from the documentation, but instaloader seems to be ignoring that and downloading all posts anyway.

Using the command line:
instaloader +/Users/user/Documents/iloader.txt

iloader.txt contains:

--filename-pattern={date_utc:%Y-%m-%d_%H-%M-%S}_{mediaid}_{owner_id}
--no-captions
--no-videos
--post-filter="date_utc >= datetime(2024,1,1)"
username

Based on the documentation, I would've expected this to download posts after 1/1/2024, but it's downloading everything.
It is the first time downloading this user; does that supersede any filter commands?
Or, is there something incorrect I'm doing here?
Thanks

bialyrycerz added the question label May 7, 2024
@samadbek-e

samadbek-e commented May 9, 2024

So you want to scrape the data before datetime(2018, 5, 31), yeah?

@bialyrycerz
Author

After. Wouldn't date_utc >= datetime(2024,1,1) scrape posts after 1/1/2024? Or am I misunderstanding something?

@giovsta

giovsta commented May 15, 2024

I did date filtering like this:

from datetime import datetime

import instaloader

L = instaloader.Instaloader()
username = "username"  # placeholder: the profile to download

profile_posts = instaloader.Profile.from_username(L.context, username).get_posts()

SINCE = datetime(2023, 1, 1)  # further from today, included
print("Downloading since: ", SINCE)

UNTIL = datetime(2024, 1, 1)  # closer to today, not included
print("Downloading until: ", UNTIL)

# The early break below assumes posts arrive newest first.
for post in profile_posts:
    postdate = post.date_utc  # naive UTC datetime, comparable with SINCE/UNTIL
    time_stamp = str(postdate)
    if postdate >= UNTIL:  # >= so that UNTIL itself is excluded
        print("Ain't there yet! The post is from " + time_stamp)
        continue
    elif postdate < SINCE:
        print("I may have gotten too far: the post is from " + time_stamp)
        break
    else:
        print("Downloading from " + time_stamp)
        L.download_post(post, username)

I haven't tested it right now, but a few weeks ago it worked, let me know if this helped!
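
For comparison, the same SINCE/UNTIL window can be written with itertools. This is only a sketch: `posts_in_range` is a made-up helper name, and the `SimpleNamespace` objects are stand-ins so it runs without logging in. It assumes the posts arrive newest first; if they do not (pinned posts might be one such case), the early stop could miss posts.

```python
from datetime import datetime
from itertools import dropwhile, takewhile
from types import SimpleNamespace


def posts_in_range(posts, since, until):
    """Yield posts with since <= date_utc < until.

    Assumes `posts` is ordered newest first. Iteration stops at the
    first post older than `since`, so older posts are never fetched.
    """
    # Skip the leading posts that are at or after `until` ...
    not_too_new = dropwhile(lambda p: p.date_utc >= until, posts)
    # ... then keep going only while posts are at or after `since`.
    return takewhile(lambda p: p.date_utc >= since, not_too_new)


# Stand-in posts for illustration; real code would pass profile.get_posts().
fake_posts = [
    SimpleNamespace(date_utc=datetime(2024, 3, 1)),
    SimpleNamespace(date_utc=datetime(2023, 6, 15)),
    SimpleNamespace(date_utc=datetime(2023, 1, 1)),
    SimpleNamespace(date_utc=datetime(2022, 12, 31)),
]

selected = [
    p.date_utc
    for p in posts_in_range(fake_posts, datetime(2023, 1, 1), datetime(2024, 1, 1))
]
print(selected)
```

With real data, each yielded post would be passed to L.download_post(post, username) as in the loop above.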

@bialyrycerz
Author

Thank you -- I am trying to do this via the command line option, though...but maybe that won't work.

@bialyrycerz
Author

Update: I am finding that --post-filter seems to work when I use it directly on the command line, but not from an args file.

So this skips earlier posts as expected:
instaloader --post-filter="date_utc > datetime(2024, 1, 1)" --filename-pattern={date_utc:%Y-%m-%d_%H-%M-%S}_{mediaid}_{owner_id} <target>
But the same filter in the args file I posted originally does not.
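
One possible explanation for the difference, sketched with Python's argparse. This assumes instaloader reads `+file` arguments through argparse's `fromfile_prefix_chars` feature, which I have not confirmed in its source. A shell removes the double quotes before the program sees the argument, whereas argparse takes each line of an args file verbatim as one argument, quotes included. The parser below is a toy with just the two arguments needed for the demonstration, and `shlex.split` stands in for the shell.

```python
import argparse
import os
import shlex
import tempfile

# Toy parser, not instaloader's real one.
parser = argparse.ArgumentParser(fromfile_prefix_chars="+")
parser.add_argument("--post-filter")
parser.add_argument("target")

# 1) What a POSIX shell hands to the program: the quotes are stripped.
shell_argv = shlex.split('--post-filter="date_utc >= datetime(2024,1,1)" username')
from_shell = parser.parse_args(shell_argv).post_filter

# 2) The same text read from an args file: each line is one raw argument.
with tempfile.TemporaryDirectory() as tmp:
    args_path = os.path.join(tmp, "iloader.txt")
    with open(args_path, "w") as fh:
        fh.write('--post-filter="date_utc >= datetime(2024,1,1)"\nusername\n')
    from_file = parser.parse_args(["+" + args_path]).post_filter

print(repr(from_shell))  # no surrounding double quotes
print(repr(from_file))   # double quotes are part of the value
```

If that is what happens, the filter string coming from the args file still starts and ends with a literal double quote.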

@makako-dev

@bialyrycerz OK, I was having the same issue and got it working. In your args file, you need to remove the quotes around the filter expression, like this:

--filename-pattern={date_utc:%Y-%m-%d_%H-%M-%S}_{mediaid}_{owner_id}
--no-captions
--no-videos
--post-filter=date_utc >= datetime(2024,1,1)
username

Now my only question is, if the posts are "skipped", does this count against our rate limit?
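
On why the quotes matter: a rough sketch, assuming the filter text is evaluated as a Python expression and its truth value decides whether a post is kept. `make_filter` is a hypothetical, simplified evaluator written for this illustration, not instaloader's actual code. With the quotes left in, the whole expression is a string literal, and a non-empty string is always truthy, so every post passes.

```python
from datetime import datetime
from types import SimpleNamespace


def make_filter(expression):
    """Hypothetical stand-in for a post filter: evaluate `expression`
    with `date_utc` bound to the post's date and return its truthiness."""
    code = compile(expression, "<post-filter>", "eval")

    def accept(post):
        return bool(eval(code, {"datetime": datetime}, {"date_utc": post.date_utc}))

    return accept


# Stand-in posts for illustration only.
old_post = SimpleNamespace(date_utc=datetime(2019, 7, 4))
new_post = SimpleNamespace(date_utc=datetime(2024, 2, 1))

unquoted = make_filter('date_utc >= datetime(2024,1,1)')
quoted = make_filter('"date_utc >= datetime(2024,1,1)"')  # quotes kept, as in the args file

print(unquoted(old_post), unquoted(new_post))  # real comparison
print(quoted(old_post), quoted(new_post))      # string literal, always truthy
```

That would match what was reported in this thread: the quoted form in the args file downloads everything, and the unquoted form skips older posts.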

4 participants