Skip to main content
SearchLoginLogin or Signup

We’ve Taken a Leap Forward with our Synapse Python Client. Here’s What Users can Expect

What the latest client updates mean for the performance and sustainability of your data management

Published onMay 20, 2024
We’ve Taken a Leap Forward with our Synapse Python Client. Here’s What Users can Expect
·

A letter to our users,

The Synapse Python Client, developed by Sage Bionetworks, provides a programmatic interface for researchers and developers to interact with the ~3 petabytes of data at scale. This article delves into the transformative shift towards Object-Oriented Programming (OOP) and the integration of asynchronous (async) operations in the Synapse Python Client in a future major version release. 

The Shift Towards Object-Oriented Programming

Object-oriented programming allows us to encapsulate data and behavior into objects, mirroring Synapse resources, thus making software design more intuitive, manageable, and scalable. The shift towards OOP in the Synapse Python Client signifies a leap towards more robust, maintainable, and flexible code management. The support for our functional programming use cases will remain, but having a robust object-oriented approach to the underlying codebase will help with the long-term sustainability of the client.

Why does OOP matter?

1. Modularity for Easier Troubleshooting: Breaking down complex processes into smaller, encapsulated objects makes it easier to understand, debug, and improve specific parts of the system without risking the integrity of the entire system.

2. Reuse of Code through Inheritance: OOP allows for the creation of a more generalized form of a class, which specialized subclasses can inherit from. This reduces redundancy and promotes code reuse, speeding up the development cycle.

3. Flexibility through Polymorphism: Polymorphism enables a single interface to represent different underlying forms (data types). An example of this in the Synapse Python Client is having a permissions interface that allows for the abstraction of this logic, but also the incorporation of this logic into relevant Synapse resources. 

4. Data-centric: OOP provides a data-centric view by combining data and behavior into objects, aligning with Synapse resources. This approach streamlines data handling within code, enhancing clarity and facilitating more effective data management. Developers can intuitively interact with and manipulate data by structuring code around data objects.

Learn more about object-oriented programming in Python: https://realpython.com/python3-object-oriented-programming/

Asynchronous Programming: A Leap in Efficiency

Asynchronous programming is a model that allows multiple tasks to run concurrently, improving the responsiveness and performance of applications, especially those that are I/O bound or network-intensive. This enhances how developers interact with Synapse services through the Python API.

What are the advantages?

1. Non-blocking I/O Operations: Traditional synchronous operations block the execution thread until the operation completes. On the other hand, asynchronous operations allow the program to run other tasks while waiting for I/O operations to complete, making the client more efficient and responsive.

2. Improved Performance and Scalability: By enabling concurrent operations without the need for multi-threading, async programming can significantly improve the performance of the Synapse Python Client, particularly in data-intensive applications.  We expect a 2x decrease in upload and download times, as the Synapse server permits.

3. Better Resource Utilization: Async programming leads to more efficient use of system resources, as it reduces the need for thread management overhead and leverages the system's I/O capabilities more effectively.

4. Simplified Error Handling and Maintenance: Async/await syntax offers a more straightforward way of handling asynchronous operations, making the code cleaner, more readable, and easier to maintain.

Don’t believe me? See benchmarking results for yourself. We can achieve close to the same upload speeds as aws s3 cli

Learn more about asynchronous programming with Python here: https://realpython.com/async-io-python/

Why You Should Care

These changes signify a renewed focus on the development experience a tool can provide. It has practical implications for developers and researchers. The shift to OOP enables improvement in software design, making code more intuitive, manageable, and scalable. Integration of async operations enhances the client's adaptability for handling small and large-scale data efficiently. These changes emphasize a commitment to providing a streamlined development experience that ensures long-term sustainability.

How You Are Impacted

This does mean there may be major changes in behavior. 

There will be parts of the client that are non-backward compatible.

Import synapseclient

syn = synapseclient.login()

file_entity = syn.get(“syn3333”)

file_entity.my_annotation # This will no longer work

file_entity.annotations.my_annotations # This is the expectation

More details to follow soon about how the API will change within the release notes.

Looking Forward

We understand the importance of supporting our scientific communities with simple-to-use functions that will utilize these changes made in the future v5 release of the client. Embracing Object-Oriented Programming and asynchronous operations programming will allow the Python client to not only become more efficient and scalable but also offer a more intuitive and robust interface for developers who want to build applications on top of Synapse with the Python API. As the Synapse platform evolves, these advancements in its Python client pave the way for more efficient data analysis and management, ultimately accelerating scientific discovery and innovation.

Look out for our newest release (4.3.0) by following our Python client news.

Comments
0
comment
No comments here
Why not start the discussion?