You’ve heard that microservices allow development teams to use any language - how do you start? What is a microservice, really, anyway?
There have been tweets showing entire microservices, emphasizing that they're very short in code length and speedy to write - but is that really all they are, just a short bit of code?
Here's an example using Groovy with Pivotal's Spring Boot:
@Grab("spring-boot-actuator")
@RestController
class GreetingsController {
@RequestMapping("/hi/{name}")
def hi(@PathVariable String name) {
[greeting: "Hi, " + name + "!"]
}
}
The magic of microservices comes from the microservices infrastructure that surrounds that "short" bit of code. In the example above, there's Groovy, which sits on a JVM; but before it gets there, there are Maven artifacts providing all the annotation support, and the spring-boot-actuator, which provides very useful convention-based endpoints for /trace, /metrics, and /info, allowing monitoring of a service.
That's only a few lines of code to stand up a service with some known API endpoints, but it's still not enough for a microservice.
A microservices infrastructure provides the devops team (the code developer, tester, and deployer) a suite of functionality, typically also implemented in a microservices architecture style. Microservices development also assumes that the microservice manages its own access to other services and can deal with failure very well. Further, a microservice may have multiple copies deployed, to guard against all sorts of distributed service latencies or failures.
Components of a microservices infrastructure include:
a registry
a library for providing convention-based endpoints for intercommunication (a regular API)
a library for fault tolerance / resiliency patterns
a library for client-side load balancing
a health and resiliency monitoring library
a library for autogenerating access to other microservices
a library to allow proxy access for other microservices the microservice may want to expose
These libraries can be implemented via annotations in Java, as spring-boot-actuator does for a regular API above, or as actual libraries included in the source of the microservice, as sketched below. The management of libraries via Maven or some other mechanism is outside the scope of microservices - that's a traditional dependency management issue.
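To make that concrete, here's a minimal sketch - an illustration under assumptions, not anyone's reference implementation - of how several of the components listed above show up as Java annotations when using Spring Cloud and its Netflix integrations. The "users" service and its /users/{id} endpoint are hypothetical:

// A sketch of a microservices infrastructure arriving as annotations plus starter dependencies.
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;
import org.springframework.cloud.netflix.feign.EnableFeignClients;
import org.springframework.cloud.netflix.feign.FeignClient;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;

@SpringBootApplication
@EnableDiscoveryClient   // registry: register this service and discover others (e.g. Eureka)
@EnableCircuitBreaker    // fault tolerance / resiliency patterns (e.g. Hystrix)
@EnableFeignClients      // autogenerated, client-side load-balanced access to other microservices
public class GreetingsApplication {
    public static void main(String[] args) {
        SpringApplication.run(GreetingsApplication.class, args);
    }
}

// Declarative client to another (hypothetical) microservice, resolved through the registry
// and load balanced on the client side (e.g. Ribbon); no URL is hard-coded here.
@FeignClient("users")
interface UserClient {
    @RequestMapping("/users/{id}")
    String userById(@PathVariable("id") String id);
}

The actuator endpoints shown earlier cover the health and monitoring item, and a gateway such as Spring Cloud's Zuul support is the usual answer to proxying or exposing other microservices.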
Patterns to Date
Code Generation: Pivotal's Spring Boot & Spring Cloud, which wrap Groovy or Java with their annotations, are an example of this.
Agent-Container: IBM, in this paper (slides), describes an agent infrastructure that’s added to a standard container. I like to call this heavy dependency “macroservices.”
Platform: A microservices “platform” - a small environment that includes all the prerequisite dependencies (libraries, etc.) that the core microservice can use. Different from an agent-container, this platform is the fabric on which a microservice is deployed. Scaling and distribution of the platform become an issue. Pivotal, again, is moving this direction with Cloud Foundry.
There’s a lot of information out there about microservices. In this post, I’m going to describe how I see people approaching microservices and point out some of the good bits of information to get familiar with the concepts. Fair warning, this is somewhat of a meta-post.
Two Common Approaches
SOA: A great starting point
After reading a few articles on microservices, people who have been creating services - or reading about them - for the last few years inevitably compare microservices to Service Oriented Architecture (SOA). Set aside the silly naming comparison: "Service Oriented Architecture" has 27 letters and a three-letter acronym that can be pronounced (unless you're an initialism person), while "microservices," at 13 letters, has the two-letter abbreviation "μs," which is pronounced the same as the full word, so the shortening gains nothing - a sort of architectural irony. Beyond that, there are a lot of good parallels between microservices and SOA.
Approaching microservices via SOA is a great starting point, since many in the software industry are familiar with SOA. Some of us greybeards will get a bit prickly and bring up things such as CORBA, etc. - and they're right: the name may be new, but the concepts have been around for a good long time. One of the first instances of the term microservices is in the early microkernel discussions around Linux (add reference here). Further, microservices are almost always implemented in a Unix style of composition: small, singly focused services used together.
Most of us in the industry actively acknowledge that vendor-driven SOA has hit its plateau of usefulness, teetering on being considered harmful - damaging the principles that SOA stands for and the software architectures based upon them (unless, of course, you work for an ESB vendor).
Classic SOA principles still apply to microservices - self-contained, loosely coupled, reusable, business-oriented services that communicate via a (typically) network protocol - but SOA has baggage. That baggage is primarily due to the vendor-driven implementations of SOA applications, such as ESBs, and protocols, such as SOAP and the WS-* suite, which have engendered monolithic applications nominally made up of services. Consider microservices as a new approach to implementing SOA, mixing in concepts from distributed computing and lean, and you have the lineage. A good analogy is the relationship between agile and an implementation of agile, like Scrum, Kanban, or XP.
Microservices is SOA principles done right.
Implementations: Blind groping at elephants
The other day, one of my friends IM'd me asking if I'd heard of CQRS, because some of his colleagues were super excited and going to implement it. Apparently there's a .NET article or two espousing the wonders of Command Query Responsibility Segregation, other developers on his team were holding a meeting to discuss using it, and he wanted to know my opinions on it. I'll talk a bit about CQRS and other patterns later, but this entrée to microservices, especially for existing applications, is a bit like randomly picking a page out of the Gang of Four Design Patterns book and refactoring everything based upon that.
Another - and better - approach that people take towards grasping microservices is via containers, like Docker, or practices like devops. Knowing about containerization implies that the person or organization values some level of immutability in deploying applications as services and has some buy-in for this style of development and deployment. Devops ups that buy-in with small teams focused on producing products and owning the full lifecycle of the product. The icing on this perspective on microservices is that it's a natural extension of the trends of agile->devops->continuous delivery.
A colleague of mine, Ian Goldsmith, drew this diagram to represent this conception:
The sources: a brief review
One of the seminal articles on microservices is Fowler and Lewis’s Microservices which espouses microservices as an architectural style and provides 9 principles of microservices. I’ll go over these shortly, but one thing to highlight from Fowler & Lewis’s article is that they make clear that the principles aren’t criteria for conformance. Another important work is Sam Newman’s Building Microservices where he provides a substantial amount of discussion around the 7 principles he’s encountered.
You might note that there's no authoritative set of principles, patterns, or even implementations - even though there are many principles, patterns, and implementations. Microservices is an emerging and evolving field of software architecture with a solid body of discussion and active practitioners who are providing this knowledge.
Adrian Cockcroft of Battery Ventures, previously of Netflix - one of the innovators in the microservices field - describes microservices like this: “Loosely coupled service oriented architecture with bounded contexts.”
Sam Newman of ThoughtWorks and another microservices expert, says this about microservices: “Small autonomous services that work together.”
Working through the articles on microservices will give a more holistic view of where and why microservices arose, and of when and how to apply microservices architecture principles.
Other Approaches?
Before we move into a deeper inspection of the sources - are those two common approaches to microservices “wrong?” Or, asked another way - again by a friend of mine while I was droning on about this subject, paraphrased - why am I such a hater?
To be clear, I'm not a microservices hater, and I'm not really a hater of anyone who approaches microservices those ways - it's just that those approaches are incomplete. Glomming on blindly to the latest fad in software development is not a great thing - eventually, the way software moves, something at the edges will become more mainstream and better understood. In the case of the intrepid CQRS adopters, they'll eventually figure out that cherry picking a specific feature of microservices won't yield the magical results they might be expecting.
A historical SOA approach is reasonable, too, but software people by-and-large are pragmatic people and a “history of software” lesson tends to be less instructive than just getting in and doing it.
The insight I got from reading about and deploying microservices is as follows: implementing microservices - whether with an existing application or a green field - requires some prerequisites, and without them the objective of scale isn't achieved. Some of these prerequisites are much easier to understand than others, primarily due to our human nature of trying to analogize with past concepts.
Devops is a clear one - being able to have an agile, cross-talent team deploy what they build - but it's not enough. Having an organization on the way to, or at, continuous delivery is a prerequisite. Being able to deploy in a discrete manner, with automated testing and load testing along the way, is an advanced organizational behavior. Not everyone's there yet, not everyone's at devops, and not everyone needs that level of intricacy to deploy software.
Decentralized everything is another. While that sounds vague - because it is - it implies no "layered" architecture: no abstracting security to a security infrastructure, no abstracting orchestration to some orchestration gateway, no abstracting of governance to a governing body. This one's also difficult to achieve for existing organizations that are built around these processes of separated architectural and organizational layers. Why's this important, though? Well, to move at scale, microservices dictate that the unit of a service be fully secured, governed, and managed by the "two-pizza" team. For a green field microservices project, this isn't too hard a concept. Refactoring an existing organization can be.
A corollary to the above is domain-driven design for data - decentralizing the data, creating bounded contexts, and therefore relying on some level of eventual consistency in the data which services access. Additionally, dealing with failure - whether network, self, or data (inconsistencies) - is another overlooked aspect. This is where one of the parents of microservices - distributed computing - rears its head. Microservices put a burden on the service design team to deal with the data for the service, and the service alone, as well as fault tolerance behaviors. No calling out to anything else to get this done, resulting in additional code. Shared libraries can help here, but the onus is on the service developers. In larger organizations, such "boilerplate" gets abstracted (for example, in the case of security) to another "layer" or even another group.
Scale occurs at every level - Organization, systems, devices, development, testing. And not everyone needs to scale this way.
Microservices is SOA at scale.
On Microservices
Microservices Drivers
Mentioned above, the disillusionment with "vendor-driven SOA" products provided a great impetus for practitioners to go back to SOA's first principles and remix them with the best practices of modern software development. SOA, at its roots, is not about implementation but about the organization, the architecture, of a system. Scaling a system has been the greatest priority of the "internet 2.0" or "web scale" era - taking the "move fast and break things" mantra of Facebook to heart and making it stable.
To be very clear here, scale is one of the prime drivers. The goal isn’t speed of invocation, it’s throughput - how many requests can be handled, elastically, rather than a finite amount quickly.
Also frequently cited - so much so that the in-joke is that you can't have a microservices article without mentioning it - is Conway's Law, from Melvin Conway (of coroutine fame - the Game of Life is John Conway's), from 1968:
organizations which design systems are constrained to produce systems which are copies of the communication structures of these organizations
This implies that the methods of communications between groups - say your project management, dev, qa, and ops groups - defines the process by which your system will be built. There’s your nice, flawed waterfall model of system design and development.
Recasting SOA in a modern light has meant that practitioners have had to discover the gaps in SOA and accommodate them. This has led to the principles in Fowler and Lewis's article, mentioned above:
Componentization via services - a SOA-originating concept, with the modern concept of adhering to published interfaces and being very careful when changing those interfaces
Organized around business capabilities - SOA principles of focusing on business features, and also Uncle Bob Martin’s single responsibility principle
Products not projects - Cue Conway’s Law reference here, but also the “you build it, you run it” aspects of devops. Note that a monolithic application could do this, too, but when things are so related, there tends to be heavy context leakage
Smart endpoints, dumb pipes - lightweight communications with no central communications broker; choreography vs. orchestration, no ESBs
Decentralized Governance - each microservice knows what to do for itself, and aggregated microservices interact in a Unix pipe-and-filter way
Decentralized Data Management - each microservice manages its own data, within a bounded context, accepting eventual consistency between services
Infrastructure automation - the microservices team controls deployment; further, there's an infrastructure that's necessary for microservices - whether it's a registry, health dashboards, etc., those have to exist for microservices to thrive (and are probably microservices, themselves)
Design for failure - services, not some external thing, must accommodate failure; plan for it, code for it, report failure
Evolutionary design - microservices will evolve; granularity of microservices will change
Tradeoffs
As with any architecture choice, there are tradeoffs. For microservices, I’ve compiled a few, here:
Provisioning of individual services puts a burden on operations. This is typically mitigated/accepted (with relish!) via devops
Communication between microservices isn't defined - including routing, integration, and discovery - thereby putting the burden on clients. Concerns from vendor-driven SOA / ESB vendors show up here, too: there's no global entity (like an ESB). Clients will typically handle failure and control their own state, rather than abstracting these functions and delegating them to a global controller.
Remote calls are more expensive than in-process (single monolith app) calls and remote APIs become more coarse grained. This tradeoff is accepted because the boundaries between services are much more defined, allowing for scaling.
Coarser-grained APIs may mean more calls between services. Again, this tradeoff is accepted in order to provide scalability, as it's easier to scale out and down than doing the same with monolithic apps.
Choreography in place of central orchestration puts a burden on microservice designers to understand that their service may be used in a pipe & filter pattern. This requires microservice designers to build an API for their service that can be used in this fashion and to keep that API relatively stable.
Fragility is introduced by having multiple, granular services. Compensation via resiliency patterns is the typical answer to this tradeoff (see the sketch after this list).
Service dependencies go over HTTP. Many microservices prefer the REST/HTTP communications paradigm (although at extreme scale a serialized RPC mechanism - protobuffers, Thrift - may be used), and some see HTTP as an inefficient mechanism. I'm not sure harping on HTTP is really a concern: HTTP/2 plus protobuffers (et al.) provides great mechanisms for communication. Further, scalability benefits are preferred over transparency in communication (such as "human readable" XML).
Multiple datasources under decentralized data management put a burden on data managers to understand the Bounded Context design pattern: there's no single logical database, updates become more complex, and consistency is eventual.
Increased monitoring, assumed to be in place to find and fix microservice connectivity problems, puts an additional burden on operations as well as on architecture design. Lots of moving parts means that monitoring is essential - an additional infrastructure / system burden.
The SLA is only as good as the weakest link - resiliency patterns and contract-based interaction help here.
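To ground the resiliency-pattern answer that keeps coming up in the list above, here's a minimal sketch of a circuit breaker with a fallback, assuming Spring Cloud's Hystrix support is on the classpath, the application is annotated with @EnableCircuitBreaker, and a @LoadBalanced RestTemplate bean is available; the "greetings" service name is hypothetical:

// A sketch, not production code: failures or timeouts trip the circuit breaker,
// and callers get the fallback instead of a cascading error.
import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

@Component
public class GreetingClient {

    private final RestTemplate rest;

    public GreetingClient(RestTemplate rest) {
        // assumed to be a @LoadBalanced RestTemplate, so "greetings" resolves via the registry
        this.rest = rest;
    }

    @HystrixCommand(fallbackMethod = "defaultGreeting")
    public String greet(String name) {
        // remote call that may fail or time out; repeated failures open the circuit
        return rest.getForObject("http://greetings/hi/{name}", String.class, name);
    }

    // invoked when the call fails or the circuit is open
    public String defaultGreeting(String name) {
        return "Hi, " + name + "! (greetings service unavailable)";
    }
}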
I was asked this question today: "How can we ensure that requests hitting our API gateway are coming from our mobile app?" What sort of guarantees are available to API publishers and to API consumers that their traffic is secured? With the Snowden releases and Heartbleed, security awareness is at an all-time high, and this question deserves a good bit of thought put into a response.
tl;dr You can’t be 100% sure.
It would seem, on the face of it, that a proper security researcher's answer would be "you can't ensure any sort of security; it's all effectively post-event fraud detection and mitigation responses," considering the multiple ways that hackers can insert man-in-the-middle attacks - from the recent "Factoring attack on RSA-EXPORT Keys" (FREAK) attack, to malware DNS poisoning, and even idiotic things like a "sign-all" Superfish certificate on one's machine.
Mobile apps connect to remote servers via web-based APIs, and there are a few different strategies to protect that traffic. The simplest (and most reductionist) way may be to have the client provide some sort of unique identifier, such as an HTTP user agent string. That can easily be spoofed by capturing network traffic and then replicating the user agent string with some other attack software - a crude form of replay attack.
Encryption, while table stakes for providing some sort of security, isn't going to do it either. TLS (Transport Layer Security), also referred to as SSL v3.1, is a start, but there are and have definitely been flaws, and undoubtedly more to come. A standard way that API publishers attempt to guarantee the authenticity of a transaction is via an API or app key, assigning a key and secret to the app consuming the API. Where do app developers put this key and secret? Sometimes they embed it in the mobile app, effectively putting the key and secret out there for anyone to brute force or decompile out of the mobile app binary. Android, iOS, and Cordova are all subject to key extraction. Similarly, attempting to use a private key to sign requests still requires that private key to be on the mobile device, with the same vulnerabilities, including replay attacks once a key's been generated or extracted.
Here's Cory Doctorow, being more eloquent than I can ever be, on the topic:
We know that DRM doesn't work for some basic security reasons. If you deliver to an attacker the cypher text, the cypher and the key, and rely on the attacker not combining those except under circumstances as you dictate, you are living in a fool's paradise. ... This is why these things take years and millions to develop and are broken in minutes for pennies by kids. Right, not because the engineers that develop them are stupid, but because they are chasing a fool's errand. [27:30] So we're chasing an impossibility here. The idea is that we will have a world in which bits can be widely copied with permission and can't be copied at all without permission. And those of you of a technical background, and I assume that is all of you, know this is an impossibility. Bruce Schneier says "making bits harder to copy is like making water that's less wet". There is no future in which bits will get progressively harder to copy. Indeed if bits did get harder to copy, it would be alarming. It would mean some of our critical infrastructure had stopped working ... Barring nuclear catastrophe, from here on in, bits only get easier to copy. And yet we're chasing a future where bits will get progressively harder to copy. [26:30]
Another sort of non-option is to push the security back to the API layer - don't put keys/secrets on the mobile app, but provide an API proxy, server-side, that holds the keys, secrets, or whatever, and then relays the call to the actual target API. Mobile hosters like appery.io do something similar, keeping the API keys on their hosted proxy and generating a Cordova mobile app that interacts with their servers. With this option, while the app keys aren't scattered across the world, there's still an attack surface, but it's radically minimized. And with the keys in house, there are more options for securing servers and networks within an organization's control. Still, there's no way to guarantee that it's the mobile app making a call to the server rather than some script kiddie's hack bot.
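As a rough illustration of that proxy option - a sketch under assumptions, not a hardened implementation - the relay can be as simple as a servlet that injects the key it holds server-side; the upstream URL, header name, and environment-variable key storage are all invented for the example:

// A minimal relay: the mobile app calls this proxy with no credentials embedded in the
// app binary, and the proxy adds the API key it holds before forwarding the request.
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ApiProxyServlet extends HttpServlet {
    // the secret stays on infrastructure the organization controls, not in the app
    private static final String API_KEY = System.getenv("UPSTREAM_API_KEY");
    private static final String UPSTREAM = "https://api.example.com";

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        // assumes the servlet is mapped to something like /proxy/*; error handling is omitted
        URL target = new URL(UPSTREAM + req.getPathInfo());
        HttpURLConnection conn = (HttpURLConnection) target.openConnection();
        conn.setRequestProperty("X-Api-Key", API_KEY);  // secret injected server-side

        resp.setStatus(conn.getResponseCode());
        try (InputStream in = conn.getInputStream(); OutputStream out = resp.getOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }
}

Rate limiting, authenticating the app's user, and logging would layer on top of a relay like this, which is easier to do when it runs on servers and networks within the organization's control.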
All other options get harder to implement and tend to become non-starters for organizations wanting to create a mobile application - creating alternative flows and custom libraries. The best approach would be to issue an app key and token to every app (and maybe even every session) that interacts with your mobile-exposed API, using a communication handshake that requests, receives, and then uses a one-time token. Even then, there's no guarantee that the process of this handshake can't be figured out by a combination of watching the network traffic and breaking TLS. Lastly, custom libraries that aren't commonly found in Android, iOS, or Cordova can be used to obfuscate the communication patterns between the mobile device and the server. Obfuscation isn't security, it's just a roadblock. Roadblocks are worth something when considering security, but they don't guarantee that the application is a valid application.
Whitelisting all apps is a potential solution, but only a potential one, as it doesn’t scale well.
All of this leads back to mitigation solutions: assume the key will leak and focus on what to do when that happens. Adopting a defensive position is very important, also practically table stakes for being serious about security. Automated monitoring and known-good patterns are hallmarks of classic fraud detection systems. Watching patterns can help organizations indirectly determine whether a connection is coming from a mobile app - expected use patterns can be turned into dynamic operational policies at (currently) great computational expense and (currently) only in a custom manner. Enter big data and machine learning.
Lastly, here's an insightful postmortem on a hack, providing nuggets like this: "Companies are continuously balancing the small risk of compromise against the broad benefits of convenience."
Sunday, September 1, 2013
Deploying the latest Google Glass Mirror API Java Quick Start to AppEngine
The Google Glass team has a series of quick starts - a code kitchen sink, if you will - in a few different languages to get developers rapidly up to speed with the Mirror API, a web API (JSON/HTTP) way of interacting with Google Glass. It comes in many code varieties: Java, Python, Ruby, Go, etc.
Originally, the Java version was designed to be deployed and run on Google's platform as a service, AppEngine (AE). After a while, a new version of the Java quick start was introduced, one that runs without the requirement of AppEngine - using Maven for library dependency management and Jetty, a sleek Java-based servlet runner. This Maven/Jetty combo strips down what's needed to the bare minimum so developers can focus more on the code rather than on how to instrument or scaffold running on a PaaS. That's all well and good, until someone wants to run it on AppEngine. There are some good reasons to run on AppEngine (or another PaaS with similar functionality) - a free hosting tier, easy deployment, and SSL for the webapp. SSL's important, because the Mirror API requires SSL for subscriptions - location and other updates from Glass.
There are four things that need to be done to modify the existing Java Quick Start to deploy on AppEngine: add an appengine-web.xml; add a logging.properties file (optional); modify the Maven pom.xml file; and change one source file.
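For the first and third of those steps, the additions look roughly like the following - a sketch, with the application id and plugin version as placeholders rather than values taken from the quick start:

<!-- src/main/webapp/WEB-INF/appengine-web.xml -->
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <application>your-app-id</application>  <!-- placeholder: your AppEngine application id -->
  <version>1</version>
  <threadsafe>true</threadsafe>
</appengine-web-app>

<!-- pom.xml: add the AppEngine Maven plugin under <build><plugins> -->
<plugin>
  <groupId>com.google.appengine</groupId>
  <artifactId>appengine-maven-plugin</artifactId>
  <version>1.8.3</version>  <!-- placeholder: match your AppEngine SDK version -->
</plugin>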
The Java Quick Start reads a local properties file to get the API console client id and secret needed to authenticate with the API services. When deploying to AppEngine, the code can't reference source paths, it has to reference resource paths relative to the classloader. In the AuthUtil.java file, we commented out the reference to the file path and added in obtaining the file from a resource path.
private static final Logger LOG = Logger.getLogger(AuthUtil.class.getSimpleName());

/**
 * Creates and returns a new {@link AuthorizationCodeFlow} for this app.
 */
public static AuthorizationCodeFlow newAuthorizationCodeFlow() throws IOException {
  //FileInputStream authPropertiesStream = new FileInputStream("./src/main/resources/oauth.properties");
  URL resource = AuthUtil.class.getResource("/oauth.properties");
  File propertiesFile = new File("./src/main/resources/oauth.properties");
  try {
    propertiesFile = new File(resource.toURI());
    //LOG.info("Able to find oauth properties from file.");
  } catch (URISyntaxException e) {
    LOG.info(e.toString());
    LOG.info("Using default source path.");
  }
  FileInputStream authPropertiesStream = new FileInputStream(propertiesFile);
  Properties authProperties = new Properties();
  authProperties.load(authPropertiesStream);
Once that's done, we can start a server locally in two ways - mvn jetty:run (the default from the quick start) or mvn appengine:devserver (a local AppEngine development server) - or use mvn appengine:update to push the Java Quick Start code to AppEngine.