Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CB-231: Update db access function and helpers for multiple entities. #121

Merged
merged 3 commits into from Jul 19, 2017

Conversation

ferbncode
Copy link
Member

Updated entity_relation_helper and get_place_by_id to fetch info for multiple entities. Function fetch multiple places can be used to fetch info of multple places.

cache.set(key=key, val=place, time=DEFAULT_CACHE_EXPIRATION)
return place


def _get_place_by_id(place_id, includes=[]):
def fetch_multiple_places(*, mbids, includes=[]):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not very familiar with the code, but default mutable values for function arguments (like includes=[] here) can cause problems. https://docs.quantifiedcode.com/python-anti-patterns/correctness/mutable_default_value_as_argument.html

Copy link
Collaborator

@alastair alastair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of these methods are getting complex to understand - especially now that you've changed the behaviour but kept the method names. Consider renaming the methods.

Returns:
Dictionary containing info of multiple places keyed by their mbid.
"""
includes = dict() if includes is None else includes
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should includes be a dictionary or a list? It was originally a list, but here you're setting it to a dict if not set. It looks like when you use it you're treating it as a list.
You can use [] here instead of list() too (or {} instead of dict())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, my bad. Fixed.


if 'artist-rels' in includes:
includes_data.setdefault('relationship_objs', {})['artist-rels'] = entity_relation_helper(db, 'artist', 'place', place.id)
entity_relation_helper(db, 'artist', 'place', place_ids, includes_data)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little complex to understand - I think that you're passing in includes_data as an argument and it's getting populated in get_relationship_links. Is that correct?
If that is the case, perhaps we shouldn't call the function get_...... The documentation should be updated to say that this value is altered, instead of being passed in as an argument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, includes_data is getting populated in get_relationship_links. I have changed the name of the function to _relationship_links_helper (the function would not be used outside this module). Also, as suggested, I added a Note in the docstring stating that includes_data is altered to contain the relationship objects keyed by the mbid.

return query.all()
relation_type = target_type + "-rels"
for link in query:
includes_data.setdefault(getattr(link, source_id_attr), {}).setdefault('relationship_objs', {}).\
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sometimes it's useful to use collections.defaultdict for things like this, where you want to be able to add arbitrary nested objects. Note that this might not serialise nicely in your redis cache and so you might need to convert it before you cache it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Updated as suggested.

@ferbncode ferbncode force-pushed the multiple-places branch 2 times, most recently from 5b12fad to 5c1fca5 Compare July 12, 2017 17:08
@gentlecat
Copy link
Contributor

gentlecat commented Jul 14, 2017

Apologies for breaking the tests and causing conflicts, but there were some important issues I had to fix. Please rebase this on top of latest master when you can.

Thanks!

@@ -2,18 +2,16 @@
from sqlalchemy.orm import joinedload


def entity_relation_helper(db, target_type, source_type, source_entity_id):
def entity_relation_helper(db, target_type, source_type, source_entity_ids, includes_data):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be a good idea to make these arguments keyword-only to make it more explicit when this function is used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a name like get_relationship_info be more informative for this function?

Also if this function doesn't return anything, what does it do? Update one of the arguments passed to it? Can you describe this (in the docstring please)?


Returns:
List of objects of relationships between entities.
Note:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not really a standard way of writing an extended description. Check https://github.com/metabrainz/guidelines/blob/master/Python.md#docstrings. Here's how you could write this:

    """Get relationship links between two entities.

    Keep in mind that includes_data (dict) is altered to contain the relationship objects
    keyed by the source entity MBIDs.

    Args:
        relation (mbdata.model): Model relating the two entities.
        query (Session.query): Query object.
        source_attr (str): 'entity0' or 'entity1' based on which represents source model in relation table.
        target_attr (str): 'entity0' or 'entity1' based on which represents target model in relation table.
        target_type (str): Type of the target entity.
        source_entity_ids (list): IDs of the source entity.
        includes_data (dict): Dictionary containing the includes data of entities.
    """

cache.set(key=key, val=place, time=DEFAULT_CACHE_EXPIRATION)
return place_rel.process(place)


def _get_place_by_id(place_id, includes=None):
def fetch_multiple_places(*, mbids, includes=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case mbids can actually be made not keyword-only. It's clear how fetch_multiple_places([<mbid_1>, <mbid_2>]) would work.

@gentlecat
Copy link
Contributor

@alastair, can you please check the changes?

Copy link
Collaborator

@alastair alastair left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now

@gentlecat gentlecat merged commit 921b514 into metabrainz:master Jul 19, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants