⚡️ Speed up read_env_file() by 21% in embedchain/utils/cli.py #1260

Open
wants to merge 2 commits into
base: main

Conversation

misrasaurabh1
Contributor

Description

📄 read_env_file() in embedchain/utils/cli.py

📈 Performance went up by 21% (0.21x faster)

⏱️ Runtime went down from 842.51μs to 698.00μs
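As a sanity check on the figures above, the 21% can be reproduced from the two runtimes. This is a small sketch that assumes the speedup is defined as old runtime divided by new runtime, minus one; the PR does not state its formula, and the variable names are illustrative.

```python
old_us = 842.51  # runtime before the change, in microseconds (from the PR)
new_us = 698.00  # runtime after the change, in microseconds (from the PR)

# Assumed definition: how much more work fits in the same time.
speedup = old_us / new_us - 1

# Alternative view: the share of the original runtime that was removed.
reduction = (old_us - new_us) / old_us

print(f"speedup: {speedup:.1%}")
print(f"runtime reduction: {reduction:.1%}")
```

Under that assumption the speedup comes to about 20.7%, which rounds to the reported 21%, while the runtime itself shrinks by about 17.2%.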

Explanation and details

Here is an optimized version of the function. I used a compiled regular expression, so the pattern is not recompiled for every line, and file.readlines() to read the file in a single call. Here is the code:

This code may be even faster on larger files, since it reads the file all at once and matches each line against the precompiled regular expression. It also reduces the number of function calls inside the loop, which tend to be expensive in Python, and relies on built-in operations.
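The approach described above can be sketched as follows. This is not the code from the PR, which is not reproduced here: the function name `read_env_file_sketch`, the pattern `(\w+)=(.*)`, and the rule of skipping blank lines and `#` comments are all assumptions made for illustration.

```python
import re

# Compiled once at import time instead of on every line (assumed pattern).
_KEY_VALUE = re.compile(r"(\w+)=(.*)")


def read_env_file_sketch(env_file_path):
    """Hypothetical stand-in for read_env_file(); not the PR's actual code."""
    env_vars = {}
    with open(env_file_path, "r") as file:
        lines = file.readlines()  # read the whole file in one call

    match = _KEY_VALUE.match  # bind the method once, outside the loop
    for line in lines:
        line = line.strip()
        # Skip empty lines and comments.
        if not line or line.startswith("#"):
            continue
        found = match(line)
        if found:
            key, value = found.groups()
            env_vars[key] = value  # a later duplicate key overwrites an earlier one
    return env_vars
```

With this pattern a line such as `KEY = value`, with spaces around the equals sign, does not match, because `\w+` must be followed directly by `=`. Whether the real function accepts such lines depends on its actual pattern.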

Type of change

Please delete options that are not relevant.

  • Refactor (does not change functionality, e.g. code style improvements, linting)

How Has This Been Tested?

The new optimized code was tested for correctness. The results are listed below.

✅ 9 Passed − 🌀 Generated Regression Tests

# imports
import pytest
import os
import re
from tempfile import NamedTemporaryFile
from embedchain.utils.cli import read_env_file
# paths of all temporary files created by the helper, recorded for cleanup
_temp_files = []

# helper function to create a temporary .env file
def create_temp_env_file(content):
    temp_file = NamedTemporaryFile(mode='w', delete=False)
    temp_file.write(content)
    temp_file.close()
    _temp_files.append(temp_file.name)
    return temp_file.name

# unit tests
@pytest.fixture
def valid_env_file():
    content = "DB_HOST=localhost\nDB_PORT=5432\nAPI_KEY=abc123"
    return create_temp_env_file(content)

@pytest.fixture
def env_file_with_comments_and_empty_lines():
    content = "# This is a comment\n\nDB_USER=user"
    return create_temp_env_file(content)

@pytest.fixture
def env_file_with_whitespace():
    content = "  DB_PASS = secret  \n\tAPI_SECRET=xyz\t"
    return create_temp_env_file(content)

@pytest.fixture
def env_file_with_special_characters():
    content = 'SECRET_KEY="my$ecretK3y"\nGREETING=Hello, World!'
    return create_temp_env_file(content)

@pytest.fixture
def env_file_with_no_value():
    content = "EMPTY_VALUE="
    return create_temp_env_file(content)

@pytest.fixture
def env_file_with_no_equals():
    content = "SOME_INVALID_LINE"
    return create_temp_env_file(content)

@pytest.fixture
def env_file_with_multiple_equals():
    content = "COMPLEX_VALUE=some=value=with=equals"
    return create_temp_env_file(content)

@pytest.fixture
def env_file_with_duplicate_keys():
    content = "DUPLICATE_KEY=first\nDUPLICATE_KEY=second"
    return create_temp_env_file(content)

@pytest.fixture
def env_file_with_unicode():
    content = "UNICODE_KEY=ünîcødé\nEMOJI=🚀"
    return create_temp_env_file(content)

def test_valid_env_file(valid_env_file):
    # Test a valid .env file
    env_vars = read_env_file(valid_env_file)
    assert env_vars == {"DB_HOST": "localhost", "DB_PORT": "5432", "API_KEY": "abc123"}

def test_env_file_with_comments_and_empty_lines(env_file_with_comments_and_empty_lines):
    # Test a .env file with comments and empty lines
    env_vars = read_env_file(env_file_with_comments_and_empty_lines)
    assert env_vars == {"DB_USER": "user"}

def test_env_file_with_whitespace(env_file_with_whitespace):
    # Test a .env file with whitespace around keys and values
    env_vars = read_env_file(env_file_with_whitespace)
    assert env_vars == {"DB_PASS": "secret", "API_SECRET": "xyz"}

def test_env_file_with_special_characters(env_file_with_special_characters):
    # Test a .env file with special characters in values
    env_vars = read_env_file(env_file_with_special_characters)
    assert env_vars == {"SECRET_KEY": '"my$ecretK3y"', "GREETING": "Hello, World!"}

def test_env_file_with_no_value(env_file_with_no_value):
    # Test a .env file with a key but no value
    env_vars = read_env_file(env_file_with_no_value)
    assert env_vars == {"EMPTY_VALUE": ""}

def test_env_file_with_no_equals(env_file_with_no_equals):
    # Test a .env file with a line that has no equals sign
    env_vars = read_env_file(env_file_with_no_equals)
    assert env_vars == {}

def test_env_file_with_multiple_equals(env_file_with_multiple_equals):
    # Test a .env file with multiple equals signs in a line
    env_vars = read_env_file(env_file_with_multiple_equals)
    assert env_vars == {"COMPLEX_VALUE": "some=value=with=equals"}

def test_env_file_with_duplicate_keys(env_file_with_duplicate_keys):
    # Test a .env file with duplicate keys
    env_vars = read_env_file(env_file_with_duplicate_keys)
    assert env_vars == {"DUPLICATE_KEY": "second"}

def test_env_file_with_unicode(env_file_with_unicode):
    # Test a .env file with unicode characters
    env_vars = read_env_file(env_file_with_unicode)
    assert env_vars == {"UNICODE_KEY": "ünîcødé", "EMOJI": "🚀"}

def test_file_not_found():
    # Test behavior when the file does not exist
    with pytest.raises(FileNotFoundError):
        read_env_file("nonexistent.env")

# Cleanup temporary files after tests
# (the fixture names refer to functions, not paths, so unlink the recorded paths instead)
def teardown_module(module):
    for temp_file in _temp_files:
        os.unlink(temp_file)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Made sure Checks passed

@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Feb 16, 2024

codecov bot commented Feb 16, 2024

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (2985b66) 56.60% compared to head (23674af) 56.56%.
Report is 20 commits behind head on main.

Files | Patch % | Lines
embedchain/utils/cli.py | 0.00% | 6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1260      +/-   ##
==========================================
- Coverage   56.60%   56.56%   -0.05%     
==========================================
  Files         146      146              
  Lines        5923     5958      +35     
==========================================
+ Hits         3353     3370      +17     
- Misses       2570     2588      +18     
