Friday, September 28, 2012

Why is PayPal interesting technology-wise?

I work at PayPal and find that the set of problems we have to solve here is both quite interesting and challenging. How come?

From the outside, PayPal seems simple - a multi-channel, global payment platform.

Consider this, however - every government in the world forcefully imposes and continuously tweaks the rules governing money movements.

Typically, financial institutions have separate operating units in every market, with independent software/systems implementing those rules as well as market-specific product features. Sometimes you even find major financial institutions with entirely separate systems serving different regions within the US - so much so that in at least one instance an account registered on the East Coast could not be accessed under the same set of online credentials as accounts established on the West Coast.

Not so for PayPal: As a true Internet company, PayPal relies on a single code base and (distributed) infrastructure that serves all customers around the world. The benefit to customers is that you can instantly pay anyone globally. The flip side is a highly complex code base that has to support constant and rapid evolution and significant requirements flux.

This presents an interesting problem set in the areas of software engineering & architecture, otherwise rarely found on such a scale. And because of its phenomenal growth, PayPal has the wherewithal to tackle these problems and aggressively pursue innovation.

Thursday, September 20, 2012

Pivoting towards rules-based application logic

Many applications are initially simple and straightforward, but gradually become more and more complex once specialization based on geographical market, user segment, etc. is required. The same problems arise when SaaS apps transition to true multi-tenant implementation from a set of bespoke, tweaked solutions for anchor clients.

At some point, managing programmatically-implemented use cases through complex, branching conditionals becomes too onerous. At this point, it's common to see the architecture pivot towards declarative, rules-based implementations.

The key to making this transition successful is in the interface between the app and the rules-based framework. Since these rules subsume portions of the app logic, one often has to pass a fairly rich application context as argument at the interface between them.

Constructing a context that is rich enough to support the widest possible set of executable rules, yet still well-structured, is a good exercise in separating the key inputs to app logic from mere artifacts of the app's internal state.

What the rules-based framework returns to the app is effectively a set of decisions, expressed as a set of parameters that define the behavior of the remaining app logic.
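As a rough sketch of that interface (all names here are illustrative, not drawn from any actual system): the app passes a well-structured context object to the framework, each rule contributes decision parameters, and the merged decisions drive the rest of the app:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical application context: the key inputs to app logic,
# abstracted away from artifacts of the app's internal state.
@dataclass(frozen=True)
class Context:
    country: str
    user_segment: str
    amount: float

# A rule inspects the context and may contribute decision parameters.
Rule = Callable[[Context], dict]

def high_value_review(ctx: Context) -> dict:
    # Illustrative threshold, not a real policy.
    return {"manual_review": True} if ctx.amount > 10_000 else {}

def eu_consent(ctx: Context) -> dict:
    return {"require_consent": True} if ctx.country in {"DE", "FR"} else {}

def evaluate(rules: list[Rule], ctx: Context) -> dict:
    decisions: dict = {}
    for rule in rules:
        decisions.update(rule(ctx))   # later rules may override earlier ones
    return decisions

decisions = evaluate(
    [high_value_review, eu_consent],
    Context(country="DE", user_segment="merchant", amount=25_000),
)
```

The remaining app logic then branches only on the returned decision parameters, not on raw market/segment conditionals scattered throughout the code.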

One final thought: mature enterprise apps often have multiple sub-domains implementing rules-based logic. In such cases, providing a façade that hides orchestration over all such sub-domains simplifies the app even more. However, this requires even more careful modeling of both the context and the result set.

Sunday, September 16, 2012

Technical architecture - what's on your mind?

Questions I typically ask myself when designing a system:
  1. Isolation/decomposition - what are the separable concerns?
     
  2. Encapsulation - how to optimally package those concerns?
     
  3. Intermediation & integration - how to compose and orchestrate over multiple components? Synchronous/blocking? Asynchronous/event-driven? Asynchronous/batch-driven?
     
  4. Consistency/Availability/Partition-tolerance trade-offs? Eventual consistency? Does 'OK' mean 'fully committed transaction' or 'I've recorded enough information to achieve a globally consistent commit at some point in the future'?
     
  5. Business logic - programmatic, data-driven & rules-based?
     
  6. Entity models - extensibility/flexibility vs. performance optimization?

These are rather generic, off the top of my head. 

Thursday, September 13, 2012

Loose coupling & integration - the pendulum swings

We've all learned that separation of concerns is a fundamental good in software/systems design. Often, what is seen as a logical next step is to package these concerns into separately deployable artifacts - as services requiring remote invocation, for example.

Many a startup goes through these phases - start simply, with monolithic codebase and relational DB as a universal persistence mechanism. If the startup is successful, its growth inevitably requires more - more features, more specialization based on market/locale and user segments, more experimentation, integration of new and acquired capabilities, external systems, partners, etc.

At some point, the monolithic code base becomes too complex and the engineering organization too large for every engineer to understand most of the code. So the order of the day is... isolation! Draw reasonable boundaries within the code and attempt to create good interfaces between isolatable portions (hopefully hiding implementation details of each from the others).

And how can one isolate various code domains from each other in a way that makes enforcement of dependency management simple? Well - SOA (service-oriented architecture) of course!

So soon enough, instead of a monolithic code base, there's a rich panoply of services. But not without a cost - creating a lot of services and forcing remote invocation in the name of isolation is far more expensive than running execution within the memory space of a single process. Intermediation between a multitude of services becomes the next challenge - with far from trivial issues of orchestration, latency, topology & discovery, geo-distributed failover, etc. If processing of the same request now requires a cascade of a dozen or more service calls, how does one still make it performant? How to handle versioning required across common interfaces - supporting both the new and the legacy clients?
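One common way to keep a cascade of service calls performant is to fan out independent downstream calls concurrently, so request latency approaches that of the slowest call rather than the sum of all of them. A minimal sketch, with stand-in stubs in place of real remote invocations (all names here are assumptions for illustration):

```python
import asyncio

# Stand-ins for remote service calls; asyncio.sleep models network latency.
async def fetch_user(uid: int) -> dict:
    await asyncio.sleep(0.01)
    return {"uid": uid}

async def fetch_risk_score(uid: int) -> dict:
    await asyncio.sleep(0.01)
    return {"risk_score": 3}

async def handle_request(uid: int) -> dict:
    # The two downstream calls don't depend on each other,
    # so they run concurrently rather than back-to-back.
    user, risk = await asyncio.gather(fetch_user(uid), fetch_risk_score(uid))
    return {**user, **risk}

result = asyncio.run(handle_request(42))
```

Calls that do depend on each other's results, of course, must still be sequenced - which is exactly where careful orchestration design earns its keep.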

Attacking these issues by sheer brute force is usually an expensive proposition - with costs rising rapidly. The only answer I've found is to ensure an adequate investment in quality technical design/architecture.

Tuesday, September 11, 2012

Processing batch/bulk requests

It's fairly common to get requests for a batch version of a previously transactional API. The main questions to ask when considering a batch implementation:

  1. Atomicity - what operations should be considered atomic, all-or-nothing (i.e. not admitting partial failures)?
     
  2. Since sizable batches cannot be processed atomically, how to handle partial failures? This quickly leads to:
     
  3. Idempotency - how to prevent erroneously submitted duplicate requests from creating snowballing failures and data corruption throughout the system?
     
  4. Downstream effects - how to ensure that downstream systems that depend on asynchronous processes, such as ETL, work well with the different load patterns created by upstream batch requests?

So introducing batch requests into the system without compromising consistency, and while maintaining load & performance SLAs, is not always a trivial task - which makes it interesting!
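The idempotency and partial-failure questions above can be sketched as follows. This is a toy in-memory version (the store, item shapes, and `process_one` are all illustrative assumptions): each item carries an idempotency key, duplicate submissions return the prior result instead of re-executing, and failures are reported per item rather than aborting the whole batch:

```python
# Idempotency store: key -> prior result. In a real system this would be
# durable and shared, not an in-process dict.
processed: dict[str, str] = {}

def process_one(item: dict) -> str:
    """The pre-existing transactional operation (illustrative)."""
    if item["amount"] < 0:
        raise ValueError("negative amount")
    return f"txn-{item['id']}"

def process_batch(items: list[dict]) -> dict:
    results, failures = {}, {}
    for item in items:
        key = item["idempotency_key"]
        if key in processed:                  # duplicate submit: replay prior result
            results[key] = processed[key]
            continue
        try:                                  # each item is its own atomic unit
            results[key] = processed[key] = process_one(item)
        except ValueError as err:             # partial failure: record and continue
            failures[key] = str(err)
    return {"results": results, "failures": failures}

out = process_batch([
    {"idempotency_key": "a", "id": 1, "amount": 10},
    {"idempotency_key": "b", "id": 2, "amount": -5},
    {"idempotency_key": "a", "id": 1, "amount": 10},   # duplicate submit
])
```

The caller gets a per-item accounting of successes and failures, which is what makes safe retries of just the failed items possible.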

Thursday, September 6, 2012

Agile vs. waterfall [software development] - the grand bargain

In waterfall-style development, the implied bargain is: product owner provides complete requirements and developer provides cost estimates and execution timeline. Of course, completeness/finality of requirements is pure fiction. So the bane of waterfall model is that every significant change in requirements requires re-estimation and re-planning, wasting huge amounts of time for all involved.

In agile, the core bargain is different - everyone saves time by not producing or parsing huge requirements docs (which often consist of vast stretches of boilerplate with useful nuggets of info hidden here and there). The product owner gets flexibility for requirements and scope changes along the way, to a degree. The developer gets a commitment to fund the implementation effort through initial launch and subsequent tweaks/experimentation.

Agile is not an excuse for lack of technical design - the architect or engineering lead still needs to isolate the salient aspects of the product to ensure what they're building will at its core be able to stand the test of time.

Tuesday, September 4, 2012

Against ivory towers

Does taking an architecture role relegate one to an ivory tower of high-level abstractions, processes and governance? Absolutely not!

Architects are responsible for ensuring there's no gap between high-level designs they create and what is ultimately implemented by developers. If an architect articulates those designs in high-level pattern language that is not readily consumable by the engineers, it's not good enough.

Architects have to provide a clear translation of what those patterns mean in terms of CODE - be it a draft implementation of core abstractions and key interfaces, pseudo-code explaining the pattern, etc. 'Naked' high-level design documents are usually insufficient.