Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor CellSelection data structure and store #3037

Open
wants to merge 101 commits into
base: develop
Choose a base branch
from
Open

Conversation

seancolsen
Copy link
Contributor

@seancolsen seancolsen commented Jul 13, 2023

Fixes #1732
Fixes #2845
Fixes #2130
Fixes #2122
Fixes #3205 via aeae784
Fixes #1893
Fixes #1911
Fixes #2651

Overview of the new approach to CellSelection in this PR

(The following text is written declaratively to describe the expected direction of this PR. The full implementation is still pending.)

  1. Each cell gets a unique string id. The id is a serialization of the row id and the column id. The code to serialize and parse these id values is made more robust and performant, and it's moved out of SheetSelection.ts, to a common location.

  2. The sheet system (polymorphic across the Table Page and Data Explorer) previously held a SheetSelection instance (a granular store). It now holds a Writable<Selection> instance instead (a coarse store).

  3. A Selection instance holds a Plane instance and a Basis instance — both of which are new constructs.

    • Plane is somewhat like a coordinate system for cells in the sheet. It stores all the row ids and column ids in a data structure which permits performant range lookups of certain rectangles, yielding iterators of cell ids in those rectangles.

    • Basis stores all the data about the selection that we need to know for external purposes. This is the source of truth for the selection data; but it also contains redundancies for performance purposes.

      There are three basis types (described more fully within the code comments) which correspond to selection types which are structurally different from one another.

      • For the dataCells basis type, the source of truth is cellIds and activeCellId, with rowIds and columnIds being derived from cellIds. This is the same design as before. Also as before, this basis type is still used when selecting all the cells in a range of rows or all the cells in a range of columns.

      • The two other basis types emptyColumns and placeholderCell will rarely be used, but exist to support special functionality when the table has no data rows or has a placeholder row.

      • For each basis type, a utility function constructs the basis from its raw requirements and computes the derived data as necessary. Basis instances are never constructed outside these utility functions.

  4. Selection has methods to generate transformations of the instance based on different actions the user might perform upon the selection. These methods will allow us to easily build new features like Shift + arrow keys to resize selections.

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the develop branch of the repository
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.

Developer Certificate of Origin

Developer Certificate of Origin
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@seancolsen seancolsen added this to the Next release milestone Jul 13, 2023
@seancolsen
Copy link
Contributor Author

seancolsen commented Jul 13, 2023

@pavish and @rajatvijay I'm assigning you both to this PR to comment with your preliminary thoughts. Don't submit a review though, because the PR is still in draft state.

I know we agreed to use an RFC process to discuss these changes, but I felt that I needed to get my hands dirty before I could really figure out how I wanted to do this, so I started with some code. This PR does not yet change any code — it only adds new code. My hope is that the scaffolding I've done so far, along with the PR description will suffice in lieu of an RFC. Does this give you a clear enough picture of where I'm headed for us to have a discussion about the direction of this refactor?

@pavish
Copy link
Member

pavish commented Jul 17, 2023

@seancolsen

I put some more thought into this and looked through the code, and while I do not like the string representation for cell ids, I don't see a performant way around it. So, I'm okay with it.

I'd like one more usecase to be considered: When a user clicks on a column header, I'd like the selection to persist when the user moves across pages.

This will require the column being decoupled from the cell selections (& it'd be great if we can also extend it to row selections).

With the design mentioned, this would be tricky, but I do think a new basis type and some additional states would help enable this.

This would greatly improve the UX of the inspectors where the selected tabs keep getting resetted everytime there's a page change. If we proceed with the refactor without this usecase, it'd be really hard to factor it in, later on, without having another refactor.

I'd also like to take a look at the utility methods needed by the components, which aren't present in the draft yet. eg., methods like isAnyRowCompletelySelected, isAnyColumnCompletelySelected etc., If we can find a way to optimize these methods as much as possible by removing the need to have to iterate through all the rows/columns, it'd make our entire selection UX more performant.

@seancolsen
Copy link
Contributor Author

@pavish and I discussed this briefly via chat.

The use case he presents is definitely worth handling, and I don't think it will be too hard to handle.

Basically the sheet would hold a Plane instance. When the user paginates (and does other things too), the sheet would generate a new Plane instance. Any time the Plane instance changes, the selection would be updated via its forNewPlane method. Within that method we'll be able to incorporate logic like "if all the cells within the column are selected, then keep that column selected"

Pavish said that he approves of the basic approach in this PR and that I can move forward with further implementation.

@seancolsen
Copy link
Contributor Author

@rajatvijay Heads up that I'm going to continue working on this tomorrow when I start my day. Please take a look when you begin your day if you'd still like to have an opportunity to review my approach.

@rajatvijay
Copy link
Contributor

@seancolsen & I had a chat about it. The approach outlined here is focused on the stability of the cell selection feature. Regarding perf - the goal is not to improve it but definitely not to make it worse.

LGTM 🚀

@seancolsen seancolsen modified the milestones: Next release, v0.1.4 Aug 17, 2023
@ghislaineguerin ghislaineguerin added this to the High priority milestone Apr 15, 2024
@seancolsen seancolsen marked this pull request as ready for review April 29, 2024 15:23
@seancolsen
Copy link
Contributor Author

seancolsen commented Apr 29, 2024

This PR is finally ready for review!

Review

@pavish I'd be happy to hop on a call to go over any questions that you come up with if you think that would be helpful.

Since we don't plan to make a release for a while, my suggestion is to try to get this merged soon so that we have a good amount of time with our team using the app to spot regressions.

QA

There's quite a lot to QA here.

@ghislaineguerin I've assigned this PR to you so that you can do prescriptive QA on it from a user's perspective. This PR makes a bunch of low-level changes to the way that cell selection works. For QA I recommend focusing on these things:

  • Application areas to QA:
    • Table page, table widget, data explorer, import preview, record selector. These areas all use the "Sheet" system, which is affected by this refactor.
  • Features to QA:
    • Pagination
    • Adding/deleting columns
    • Column reordering via drag/drop
    • Column resizing
    • Drag to select multiple rows
    • Drag to select multiple columns
    • Shift+click (or even Shift+click-and-drag) to select range
    • Inspector tab is being auto-selected in both table page and DE
    • Table page with grouping applied
    • Auto-scrolling to cells based on active cell and selected cells
    • Checkbox cell
    • Copy cells from table and from data explorer
    • Ability to arrow-down into cells within placeholder row, and add data
    • Trying to move into the placeholder row when it doesn't exist (e.g. in table share when no permission to add record)
    • Selecting cells within an unsaved new row
    • Adding new records via placeholder row
    • Adding new records via New Record button

During QA, try to pay particular attention to the way that different features interact with the different ways that cells can be selected. Test edge cases you can think of. Try to identify any regressions. If you find problems, make sure to test against the latest release to determine if it's a longstanding bug or a regression.

@seancolsen seancolsen added the pr-status: review A PR awaiting review label Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment