pocesar/apify-login-session

Login Session

Get localStorage, sessionStorage and cookies from logins for usage in other actors.

Usage

This actor helps you (re)use logged-in sessions for your websites and services, abstracting away the need to develop your own login mechanism. It uses a named session storage, so when you request a new session, it is readily available. It's tailored to work seamlessly with the Apify platform and other actors.

It's more low-level than other actors, but it tries to cover the most common use cases like:

  • Single Page Applications
  • Server Side Rendered websites
  • Ajax login calls
  • Multi step logins such as username on one page, password on another

You may call it directly from your actor, or use the INPUT.json to create a task or schedule that keeps seeding your session storage with new sessions. It cannot deal with 2FA or captchas (yet).

// in your actor
const storageName = 'session-example';

const call = await Apify.call('pocesar/login-session', {
    username: 'username',
    password: 'password',
    website: [{ url: 'http://example.com' }], // the RequestList format
    cookieDomains: [
        "http://example.com"
    ],
    sessionConfig: {
        storageName,
        maxAgeSecs: 3600,
        maxUsageCount: 10,
        maxPoolSize: 120
    },
    steps: [{
        username: {
            selector: "input#email", // the input that receives the username
            timeoutMillis: 10000 // optional timeout in ms
        },
        password: {
            selector: "input#password" // the input that receives the password
        },
        submit: {
            selector: "input[type=\"submit\"]", // the button that executes the login
        },
        failed: {
            selector: "[role=\"alert\"],#captcha", // usually an error that tells the login failed
            timeoutMillis: 10000 // optional timeout in ms
        },
        waitForMillis: 15000 // optional "sleep" in ms to consider the page as "settled"
    }]
});

const { session, error } = call.output;

// if it fails, the error will be filled with something
// otherwise, the session will have the Session parameters, that can be
// instantiated manually using `new Apify.Session({ ...session, sessionPool })`

// load the session pool from the storage, so it has our new
// session. this might change in the future
const sessionPool = await Apify.openSessionPool({
    persistStateKeyValueStoreId: storageName
});

const sessionJustCreated = sessionPool.sessions.find(s => s.id === session.id);

/**
 * the complete Cookie string for usage on the header
 */
sessionJustCreated.getCookieString('http://example.com');

/**
 * contains the User-Agent used for the login request.
 * the same userAgent must be set between uses so there's no
 * conflict and blocks. Set this as your User-Agent header
 **/
sessionJustCreated.userData.userAgent;

/**
 * the proxyUrl used, can be empty.
 * Set this as your proxyUrl parameter in crawlers.
 *
 * This might be undefined if you didn't use any proxies
 */
sessionJustCreated.userData.proxyUrl;

/**
 * object containing any sessionStorage content, useful for JWT tokens.
 * Useful for using in PuppeteerCrawler
 */
sessionJustCreated.userData.sessionStorage;

/**
 * object containing any localStorage content, useful for JWT
 * tokens. Useful for using in PuppeteerCrawler
 */
sessionJustCreated.userData.localStorage;
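
The fields above can be wired into plain HTTP requests or into a Puppeteer page. The following is a minimal sketch, not part of this actor: `buildHeaders` and `restoreStorage` are hypothetical helper names, and the sketch assumes that the stored localStorage and sessionStorage values are plain strings. The only Puppeteer call used is `page.evaluateOnNewDocument`.

```javascript
// Sketch only: `buildHeaders` and `restoreStorage` are hypothetical helpers,
// not part of this actor or of the Apify SDK.

/**
 * Build HTTP headers from a session, reusing the cookies and
 * the User-Agent that were captured during the login.
 */
function buildHeaders(session, url) {
    const headers = { Cookie: session.getCookieString(url) };
    const { userAgent } = session.userData || {};
    if (userAgent) {
        // the same User-Agent must be sent on every reuse of the session
        headers['User-Agent'] = userAgent;
    }
    return headers;
}

/**
 * Register a script that refills localStorage and sessionStorage
 * before any page script runs. `page` is a Puppeteer Page.
 * Assumption: all stored values are strings.
 */
function restoreStorage(page, userData) {
    const stored = {
        local: userData.localStorage || {},
        session: userData.sessionStorage || {},
    };
    return page.evaluateOnNewDocument((data) => {
        for (const [key, value] of Object.entries(data.local)) {
            window.localStorage.setItem(key, value);
        }
        for (const [key, value] of Object.entries(data.session)) {
            window.sessionStorage.setItem(key, value);
        }
    }, stored);
}
```

Call `restoreStorage(page, sessionJustCreated.userData)` before `page.goto(...)`. The registered script runs in every document the page loads, so the values are written to whatever origin is opened; keep the page on the site that produced the session.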

Login locally then use the session on the platform

You can log in locally by running the login-session actor on your machine. Make sure you're logged in to the platform using apify login, and set the forceCloud input option, like this:

{
    "username": "username",
    "password": "s3cr3tp4ssw0rd",
    "website": [{ "url": "https://example.com/" }],
    "sessionConfig": {
        "storageName": "example-login-sessions" // need to use this
    },
    "steps": [
        {
            "username": { "selector": "#email" },
            "password": { "selector": "#password" },
            "submit": { "selector": "input[type=\"submit\"]" },
            "success": { "selector": ".main-menu", "timeoutMillis": 10000 },
            "failed": {
                "selector": ".login.error",
                "timeoutMillis": 10000
            },
            "waitForMillis": 30000
        }
    ],
    "cookieDomains": ["https://example.com"],
    "proxyConfiguration": {
        "useApifyProxy": false
    },
    "forceCloud": true // this forces the login to be saved on platform Storage (https://my.apify.com/storage#/keyValueStores)
}

Place this in your apify_storage/key_value_stores/default/INPUT.json file (without the // comments, which are not valid JSON), then run locally:

$ apify run --purge

The session will be created in your Apify platform account, under the storageName you provided, but using your local IP. Using this, you're able to avoid PIN requests and security checkpoint screens.
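
Once the session is stored on the platform, another actor can open the same named storage and pick a session that is still valid. The sketch below is one possible way to do that, not the actor's own code: `pickUsableSession` is a hypothetical helper, and it relies on the `sessions` array and the `isUsable()` method of the Apify SDK session pool.

```javascript
// Sketch only: `pickUsableSession` is a hypothetical helper,
// not part of this actor or of the Apify SDK.

/**
 * Return the first session in the pool that is not expired,
 * blocked or over its usage limit, or null if there is none.
 */
function pickUsableSession(sessionPool) {
    return sessionPool.sessions.find((s) => s.isUsable()) || null;
}

// Usage inside an actor running on the platform (assumes the Apify SDK
// and the storageName from the input above):
//
// const sessionPool = await Apify.openSessionPool({
//     persistStateKeyValueStoreId: 'example-login-sessions',
// });
// const session = pickUsableSession(sessionPool);
// if (!session) {
//     // no valid login left, run the login-session actor again
// }
```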

Input Recipes

Here are some real-life examples of INPUT.json that you may use:

Gmail

{
    "username": "username",
    "password": "password",
    "website": [{ "url": "https://accounts.google.com/signin/v2/identifier?service=mail&passive=true&flowName=GlifWebSignIn&flowEntry=ServiceLogin" }],
    "cookieDomains": [
        "https://mail.google.com",
        "https://accounts.google.com",
        "https://google.com"
    ],
    "steps": [{
        "username": {
            "selector": "#identifierId"
        },
        "submit": {
            "selector": "#identifierNext"
        },
        "success": {
            "selector": "input[type=\"password\"]",
            "timeoutMillis": 10000
        },
        "failed": {
            "selector": "#identifierId[aria-invalid=\"true\"],iframe[src*=\"CheckConnection\"]"
        },
        "waitForMillis": 30000
    }, {
        "password": {
            "selector": "input[type=\"password\"]"
        },
        "submit": {
            "selector": "#passwordNext",
            "timeoutMillis": 15000
        },
        "failed": {
            "selector": "input[type=\"password\"][aria-invalid=\"true\"],iframe[src*=\"CheckConnection\"]",
            "timeoutMillis": 5000
        },
        "success": {
            "selector": "link[href*=\"mail.google.com\"]",
            "timeoutMillis": 10000
        },
        "waitForMillis": 30000
    }]
}

Facebook

{
    "username": "username",
    "password": "password",
    "website": [{ "url": "https://www.facebook.com/" }],
    "cookieDomains": [
        "https://facebook.com"
    ],
    "steps": [{
        "username": {
            "selector": "#login_form [type=\"email\"]"
        },
        "password": {
            "selector": "#login_form [type=\"password\"]"
        },
        "submit": {
            "selector": "#login_form [type=\"submit\"]"
        },
        "success": {
            "selector": "body.home",
            "timeoutMillis": 10000
        },
        "failed": {
            "selector": "body.login_page,body.UIPage_LoggedOut",
            "timeoutMillis": 10000
        },
        "waitForMillis": 30000
    }]
}

Twitter

{
    "username": "username",
    "password": "password",
    "website": [{ "url": "https://twitter.com/login" }],
    "cookieDomains": [
        "https://twitter.com"
    ],
    "steps": [{
        "username": {
            "selector": "h1 ~ form [name=\"session[username_or_email]\"]",
            "timeoutMillis": 2000
        },
        "password": {
            "selector": "h1 ~ form [name=\"session[password]\"]",
            "timeoutMillis": 2000
        },
        "submit": {
            "selector": "h1 ~ form [role=\"button\"][data-focusable]"
        },
        "success": {
            "selector": "h2[role=\"heading\"]",
            "timeoutMillis": 10000
        },
        "failed": {
            "selector": "h1 ~ form [role=\"button\"][disabled]",
            "timeoutMillis": 10000
        },
        "waitForMillis": 30000
    }]
}

Instagram

{
    "username": "username",
    "password": "password",
    "website": [{ "url": "https://instagram.com" }],
    "cookieDomains": [
        "https://www.instagram.com"
    ],
    "steps": [{
        "username": {
            "selector": "input[name=\"username\"]",
            "timeoutMillis": 10000
        },
        "password": {
            "selector": "input[name=\"password\"]",
            "timeoutMillis": 10000
        },
        "submit": {
            "selector": "button[type=\"submit\"]"
        },
        "success": {
            "selector": "img[alt=\"Instagram\"]",
            "timeoutMillis": 10000
        },
        "failed": {
            "selector": "#slfErrorAlert",
            "timeoutMillis": 5000
        },
        "waitForMillis": 30000
    }]
}

LinkedIn

{
    "username": "username",
    "password": "password",
    "website": [
        {
            "url": "https://www.linkedin.com/login?fromSignIn=true&trk=guest_homepage-basic_nav-header-signin"
        }
    ],
    "steps": [
        {
            "username": {
                "selector": "#username"
            },
            "password": {
                "selector": "#password"
            },
            "submit": {
                "selector": ".login__form_action_container button"
            },
            "success": {
                "selector": ".authentication-outlet,.launchpad-cp-enabled",
                "timeoutMillis": 15000
            },
            "failed": {
                "selector": ".form__input--error,.login__form,.pin-verification-form",
                "timeoutMillis": 15000
            },
            "waitForMillis": 30000
        }
    ],
    "cookieDomains": ["https://www.linkedin.com"]
}

Related Content

Caveats

  • Apify proxy sessions can last at most 24h, so never set your maxAgeSecs greater than 86400 (24h in seconds)
  • If the proxy fails, the login fails. If the proxy is banned, the login fails.
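
To guard against the 24h limit, the sessionConfig can be capped before it is passed to the actor. This is a minimal sketch under that assumption: `clampSessionConfig` and `MAX_PROXY_SESSION_SECS` are hypothetical names, not part of the actor's input handling.

```javascript
// Sketch only: hypothetical helper, not part of this actor.

// 24 hours expressed in seconds
const MAX_PROXY_SESSION_SECS = 24 * 60 * 60;

/**
 * Return a copy of sessionConfig whose maxAgeSecs never exceeds 24h.
 * A missing maxAgeSecs is left untouched so the actor default applies.
 */
function clampSessionConfig(sessionConfig) {
    if (typeof sessionConfig.maxAgeSecs !== 'number') {
        return { ...sessionConfig };
    }
    return {
        ...sessionConfig,
        maxAgeSecs: Math.min(sessionConfig.maxAgeSecs, MAX_PROXY_SESSION_SECS),
    };
}
```

For example, `clampSessionConfig({ storageName: 'session-example', maxAgeSecs: 100000 })` returns a config with `maxAgeSecs` set to 86400 and leaves the input object unchanged.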

Example

An example form is available at https://now-h3p8398gc.now.sh

License

Apache 2.0