⚡️ Speed up merge_metadata_dict() by 13% in embedchain/memory/utils.py #1262

Open
wants to merge 1 commit into
base: main

Conversation

misrasaurabh1
Contributor

Description

📄 merge_metadata_dict() in embedchain/memory/utils.py

📈 Performance went up by 13% (0.13x faster)

⏱️ Runtime went down from 35.80μs to 31.60μs
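The two figures above can be reconciled with a little arithmetic. The sketch below assumes the reported 13% is a speedup ratio (old runtime divided by new runtime, minus one); under that assumption the same measurements correspond to a smaller percentage when expressed as a runtime reduction. The function names are illustrative and are not part of the PR.

```python
def speedup_percent(original_us: float, optimized_us: float) -> float:
    """Speedup as a ratio of runtimes, expressed as a percentage."""
    return (original_us / optimized_us - 1.0) * 100.0


def runtime_reduction_percent(original_us: float, optimized_us: float) -> float:
    """Share of the original runtime that was saved, as a percentage."""
    return (original_us - optimized_us) / original_us * 100.0


# Runtimes reported in this PR, in microseconds.
print(round(speedup_percent(35.80, 31.60), 1))            # 13.3
print(round(runtime_reduction_percent(35.80, 31.60), 1))  # 11.7
```

The absolute saving is about 4.2 μs per call, so the benefit is only noticeable if the function sits on a hot path.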

Explanation and details


This PR refactors merge_metadata_dict() so that it merges two dictionaries with fewer type checks.

The rewritten function drops the redundant isinstance() checks on merged[k]. By the time those checks run, the function has already verified that merged[k] and v have the same type, so checking v alone is sufficient. Each overlapping key now needs one fewer dictionary lookup per check, which makes the function slightly faster.
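A minimal sketch of the idea follows. It is not the embedchain source: the name `merge_metadata_sketch`, the error messages, and the `is None` handling are assumptions chosen to match the behavior exercised by the generated tests in this PR. The point it illustrates is that once `type(merged[k]) is type(v)` has been established, the later branches can test `v` directly instead of looking up `merged[k]` again.

```python
from typing import Any, Dict, Optional


def merge_metadata_sketch(
    left: Optional[Dict[str, Any]], right: Optional[Dict[str, Any]]
) -> Optional[Dict[str, Any]]:
    """Hypothetical stand-in for merge_metadata_dict(), for illustration only."""
    if left is None and right is None:
        return None
    if left is None:
        return right
    if right is None:
        return left

    merged = left.copy()  # shallow copy, so the caller's top-level dict is untouched
    for k, v in right.items():
        if k not in merged:
            merged[k] = v
        elif type(merged[k]) is not type(v):
            raise ValueError(f"key {k!r} exists on both sides with different types")
        # From here on merged[k] and v share a type, so testing v is enough.
        elif isinstance(v, str):
            merged[k] += v  # strings are concatenated
        elif isinstance(v, dict):
            merged[k] = merge_metadata_sketch(merged[k], v)  # dicts merge recursively
        else:
            raise ValueError(f"key {k!r} exists on both sides and cannot be merged")
    return merged


print(merge_metadata_sketch({"source": "a"}, {"source": "b"}))  # {'source': 'ab'}
```

Testing `v` avoids one subscript operation per branch, which is a small saving and consistent with the modest speedup reported above.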

Type of change


  • Refactor (does not change functionality, e.g. code style improvements, linting)

How Has This Been Tested?

  • Test Script (please provide)

✅ 14 Passed − 🌀 Generated Regression Tests

# imports
import pytest  # used for the unit tests below
from embedchain.memory.utils import merge_metadata_dict

# unit tests

def test_both_inputs_none():
    assert merge_metadata_dict(None, None) is None

def test_left_none():
    assert merge_metadata_dict(None, {'key': 'value'}) == {'key': 'value'}

def test_right_none():
    assert merge_metadata_dict({'key': 'value'}, None) == {'key': 'value'}

def test_both_empty():
    assert merge_metadata_dict({}, {}) == {}

def test_left_empty():
    assert merge_metadata_dict({}, {'key': 'value'}) == {'key': 'value'}

def test_right_empty():
    assert merge_metadata_dict({'key': 'value'}, {}) == {'key': 'value'}

def test_no_overlapping_keys():
    assert merge_metadata_dict({'key1': 'value1'}, {'key2': 'value2'}) == {'key1': 'value1', 'key2': 'value2'}

def test_overlapping_keys_same_type():
    assert merge_metadata_dict({'key': 'value1'}, {'key': 'value2'}) == {'key': 'value1value2'}

def test_overlapping_keys_different_type():
    with pytest.raises(ValueError):
        merge_metadata_dict({'key': 'value1'}, {'key': 123})

def test_nested_dictionaries_no_overlap():
    assert merge_metadata_dict({'key': {'subkey1': 'subvalue1'}}, {'key': {'subkey2': 'subvalue2'}}) == {'key': {'subkey1': 'subvalue1', 'subkey2': 'subvalue2'}}

def test_nested_dictionaries_overlap():
    assert merge_metadata_dict({'key': {'subkey': 'subvalue1'}}, {'key': {'subkey': 'subvalue2'}}) == {'key': {'subkey': 'subvalue1subvalue2'}}

def test_deeply_nested_dictionaries():
    assert merge_metadata_dict({'key': {'subkey': {'subsubkey': 'subsubvalue1'}}}, {'key': {'subkey': {'subsubkey': 'subsubvalue2'}}}) == {'key': {'subkey': {'subsubkey': 'subsubvalue1subsubvalue2'}}}

def test_complex_dictionaries_mixed_types():
    assert merge_metadata_dict({'key1': 'value1', 'key2': {'subkey': 'subvalue1'}}, {'key2': {'subkey': 'subvalue2'}, 'key3': 123}) == {'key1': 'value1', 'key2': {'subkey': 'subvalue1subvalue2'}, 'key3': 123}

def test_overlapping_keys_unsupported_merge():
    with pytest.raises(ValueError):
        merge_metadata_dict({'key': [1, 2, 3]}, {'key': [4, 5, 6]})
    with pytest.raises(ValueError):
        merge_metadata_dict({'key': {1, 2, 3}}, {'key': {4, 5, 6}})

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Made sure Checks passed

dosubot bot added the size:S label (This PR changes 10-29 lines, ignoring generated files.) on Feb 16, 2024

codecov bot commented Feb 16, 2024

Codecov Report

Attention: 6 lines in your changes are missing coverage. Please review.

Comparison is base (2985b66) 56.60% compared to head (7e06e47) 56.59%.
Report is 20 commits behind head on main.

Files                        Patch %   Lines
embedchain/memory/utils.py   0.00%     6 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1262      +/-   ##
==========================================
- Coverage   56.60%   56.59%   -0.02%     
==========================================
  Files         146      146              
  Lines        5923     5955      +32     
==========================================
+ Hits         3353     3370      +17     
- Misses       2570     2585      +15     

