feat(server): near-duplicate detection #8228

Merged on May 16, 2024 (49 commits; changes shown from 46 commits)

Commits
2d5d21a
duplicate detection job, entity, config
mertalev Mar 23, 2024
1a3fbb5
queueing
mertalev Mar 23, 2024
77a67b9
job panel, update api
mertalev Mar 23, 2024
3d78182
use embedding in db instead of fetching
mertalev Mar 23, 2024
3b29523
disable concurrency
mertalev Mar 23, 2024
e988110
only queue visible assets
mertalev Mar 23, 2024
0b7b0d2
handle multiple duplicateIds
mertalev Mar 24, 2024
6adf199
update concurrent queue check
mertalev Mar 24, 2024
540c7a2
add provider
mertalev Apr 1, 2024
dbdc113
add web placeholder, server endpoint, migration, various fixes
mertalev Apr 2, 2024
ad70903
update sql
mertalev Apr 2, 2024
7decbb3
select embedding by default
mertalev Apr 2, 2024
fd88cfb
rename variable
mertalev Apr 2, 2024
824ad11
simplify
mertalev Apr 27, 2024
421e1d4
remove separate entity, handle re-running with different threshold, s…
mertalev Apr 27, 2024
b0377b8
fix tests
mertalev Apr 27, 2024
9a3dc52
add tests
mertalev Apr 27, 2024
5369430
add index to entity
mertalev Apr 28, 2024
b238f61
formatting
mertalev Apr 28, 2024
f26c71f
update asset mock
mertalev Apr 28, 2024
f8eacd8
fix `upsertJobStatus` signature
mertalev Apr 28, 2024
ef75fa6
update sql
mertalev Apr 28, 2024
8c0d071
formatting
mertalev Apr 28, 2024
f39d061
default to 0.03
mertalev May 4, 2024
965f1bd
optimize clustering
mertalev May 4, 2024
f217e63
use asset's `duplicateId` if present
mertalev May 4, 2024
be9a678
update sql
mertalev May 4, 2024
ec31d64
update tests
mertalev May 4, 2024
052fb2f
expose admin setting
mertalev May 4, 2024
1baa2fa
refactor
mertalev May 4, 2024
6370428
formatting
mertalev May 4, 2024
185c085
skip if ml is disabled
mertalev May 5, 2024
846fb13
debug trash e2e
mertalev May 5, 2024
0a06662
remove from web
mertalev May 8, 2024
3b40054
remove from sidebar
mertalev May 8, 2024
6284ec7
test if ml is disabled
mertalev May 8, 2024
afaf290
update sql
mertalev May 8, 2024
65b4ed2
separate duplicate detection from clip in config, disable by default …
mertalev May 9, 2024
e8aa664
fix doc
mertalev May 9, 2024
9c07657
lower minimum `maxDistance`
mertalev May 9, 2024
73cc57c
update api
mertalev May 9, 2024
39e5069
Add and Use Duplicate Detection Feature Flag (#9364)
NicholasFlamy May 10, 2024
5160ce5
chore: fixes and additions after rebase
zackpollard May 15, 2024
3167d5c
chore: update api (remove new Role enum)
zackpollard May 15, 2024
5e46b7c
fix: left join smart search so getAll works without machine learning
zackpollard May 15, 2024
a5dbef9
test: trash e2e go back to checking length of assets is zero
zackpollard May 15, 2024
d3eb872
chore: regen api after rebase
May 16, 2024
28a6e62
test: fix tests after rebase
May 16, 2024
95eac75
redundant join
mertalev May 16, 2024
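
The commit subjects above mention handling multiple `duplicateId` values and reusing an asset's own `duplicateId` when present. The following is a minimal sketch of one way such a resolution step could look; it is not taken from the PR, and `resolveDuplicateId`, `Resolution`, and `newId` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical sketch: these names and this logic are assumptions, not the PR's code.
interface Resolution {
  duplicateId: string; // group id to assign to the asset and its matches
  idsToMerge: string[]; // other existing group ids that would be folded into it
}

function resolveDuplicateId(
  own: string | null,
  neighbours: Array<string | null>,
  newId: () => string,
): Resolution {
  // Collect the distinct group ids already assigned to matching assets.
  const existing = Array.from(new Set(neighbours.filter((id): id is string => id !== null)));
  // Prefer the asset's own id, then any existing neighbour id, else create a new one.
  const duplicateId = own ?? existing[0] ?? newId();
  return { duplicateId, idsToMerge: existing.filter((id) => id !== duplicateId) };
}
```

With this shape, re-running detection on an asset that already belongs to a group keeps its group id stable, and any other group ids found among its matches are reported for merging.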
4 changes: 4 additions & 0 deletions docs/docs/install/config-file.md
@@ -77,6 +77,10 @@ The default configuration looks like this:
  "enabled": true,
  "modelName": "ViT-B-32__openai"
  },
+ "duplicateDetection": {
+ "enabled": false,
+ "maxDistance": 0.03
+ },
  "facialRecognition": {
  "enabled": true,
  "modelName": "buffalo_l",
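
To illustrate what a `maxDistance` threshold such as `0.03` might mean in practice, here is a minimal sketch that compares embeddings by cosine distance and keeps candidates at or below the threshold. This excerpt does not show the PR's actual query or distance metric, so the use of cosine distance and the names `Candidate`, `cosineDistance`, and `findNearDuplicates` are assumptions.

```typescript
// Hypothetical sketch: the metric and all names here are assumptions, not the PR's code.
type Embedding = number[];

interface Candidate {
  id: string;
  embedding: Embedding;
}

// Cosine distance = 1 - cosine similarity; 0 means the vectors point the same way.
function cosineDistance(a: Embedding, b: Embedding): number {
  if (a.length !== b.length) {
    throw new Error('embeddings must have the same dimension');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the ids of candidates whose distance to the target is at most maxDistance.
function findNearDuplicates(target: Candidate, others: Candidate[], maxDistance = 0.03): string[] {
  return others
    .filter((c) => c.id !== target.id && cosineDistance(target.embedding, c.embedding) <= maxDistance)
    .map((c) => c.id);
}
```

Under this reading, a smaller `maxDistance` makes matching stricter (fewer, closer matches), and a larger one makes it more permissive.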
1 change: 1 addition & 0 deletions e2e/src/api/specs/server-info.e2e-spec.ts
@@ -66,6 +66,7 @@ describe('/server-info', () => {
  expect(body).toEqual({
  smartSearch: false,
  configFile: false,
+ duplicateDetection: false,
  facialRecognition: false,
  map: true,
  reverseGeocoding: true,
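
The test above expects a `duplicateDetection` flag in the server features response. As a rough sketch of how such a flag could be derived from configuration, the snippet below assumes the flag is on only when machine learning, CLIP-based smart search, and duplicate detection are all enabled. That dependency chain and the `MachineLearningConfig` shape are assumptions; only the `enabled` and `maxDistance` fields of `duplicateDetection` appear in the diff.

```typescript
// Hypothetical config shape: an assumption made for illustration only.
interface MachineLearningConfig {
  enabled: boolean;
  clip: { enabled: boolean };
  duplicateDetection: { enabled: boolean; maxDistance: number };
}

// Assumed rule: duplicate detection needs ML and CLIP embeddings to be available.
function isDuplicateDetectionEnabled(ml: MachineLearningConfig): boolean {
  return ml.enabled && ml.clip.enabled && ml.duplicateDetection.enabled;
}
```

A job handler could then return early when this function reports `false`, which would be consistent with the "skip if ml is disabled" commit subject.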
11 changes: 5 additions & 6 deletions e2e/src/api/specs/trash.e2e-spec.ts
@@ -32,8 +32,7 @@ describe('/trash', () => {
  await utils.deleteAssets(admin.accessToken, [assetId]);

  const before = await getAllAssets({}, { headers: asBearerAuth(admin.accessToken) });
-
- expect(before.length).toBeGreaterThanOrEqual(1);
+ expect(before).toStrictEqual([expect.objectContaining({ id: assetId, isTrashed: true })]);

  const { status } = await request(app).post('/trash/empty').set('Authorization', `Bearer ${admin.accessToken}`);
  expect(status).toBe(204);
@@ -57,14 +56,14 @@ describe('/trash', () => {
  const { id: assetId } = await utils.createAsset(admin.accessToken);
  await utils.deleteAssets(admin.accessToken, [assetId]);

- const before = await utils.getAssetInfo(admin.accessToken, assetId);
- expect(before.isTrashed).toBe(true);
+ const before = await getAllAssets({}, { headers: asBearerAuth(admin.accessToken) });
+ expect(before).toStrictEqual([expect.objectContaining({ id: assetId, isTrashed: true })]);

  const { status } = await request(app).post('/trash/restore').set('Authorization', `Bearer ${admin.accessToken}`);
  expect(status).toBe(204);

- const after = await utils.getAssetInfo(admin.accessToken, assetId);
- expect(after.isTrashed).toBe(false);
+ const after = await getAllAssets({}, { headers: asBearerAuth(admin.accessToken) });
+ expect(after).toStrictEqual([expect.objectContaining({ id: assetId, isTrashed: false })]);
  });
  });

3 changes: 3 additions & 0 deletions mobile/openapi/.openapi-generator/FILES

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

1 change: 1 addition & 0 deletions mobile/openapi/doc/AllJobStatusResponseDto.md

38 changes: 38 additions & 0 deletions mobile/openapi/doc/AssetApi.md

16 changes: 16 additions & 0 deletions mobile/openapi/doc/DuplicateDetectionConfig.md

1 change: 1 addition & 0 deletions mobile/openapi/doc/ServerFeaturesDto.md

1 change: 1 addition & 0 deletions mobile/openapi/doc/SystemConfigMachineLearningDto.md

108 changes: 108 additions & 0 deletions mobile/openapi/lib/model/duplicate_detection_config.dart

32 changes: 32 additions & 0 deletions mobile/openapi/test/duplicate_detection_config_test.dart
