Showing posts with label OSGi. Show all posts
Showing posts with label OSGi. Show all posts

Tuesday, March 1, 2016

Adding Aspects to OSGi Services with no Bytecode Weaving

In my previous post I looked at how easy it is to apply the 'branch by abstraction' pattern to OSGi services, where the Service API is the abstraction layer and you can branch at runtime without even taking down the service client.
Another part of the story here is that it can be useful to add aspects to an implementation. This can also be done using the branch by abstraction pattern and in OSGi you can add these aspects to existing services without the need to modify them, and also without the need for bytecode manipulation. In OSGi you can proxy services by hiding the original and placing a proxy service in the service registry that consumers will use instead. This makes it possible to add a chain of aspects to the service invocations without the need to either change the clients nor the services themselves.

To illustrate, I started a little project called service-spector that allows you to add service aspects to any kind of service. The aspects themselves are also implemented as services. Here's what a ServiceAspect looks like:

public interface ServiceAspect {
  String announce();
  void preServiceInvoke(ServiceReference service, Method method, Object[] args);
  void postServiceInvoke(ServiceReference service, Method method, Object[] args, Object result);
  String report();
}

Aspects are called before and after a service invocation is done via preServiceInvoke()/postServiceInvoke(). They are called with a reference to the service being invoked and with the parameters that are provided to the service. The post call is also provided with the result of the invocation. The aspect API allows you to do things like obtaining metrics for services, add logging to existing service calls or anything else really that an aspect could do. Additionally it contains annouce() and report() to present information to the user.

The service-spector is configured using Configuration Admin with a list of filters for services that need have the aspects applied in the service.filters property. There is an additional configuration item that states whether the original services need to be hidden. The proxies are automatically registered with a higher service ranking than the original service. If the consumer of the original only uses a single service the original does not need to be hidden as the one with the higher ranking will be selected. However, if a consumer uses all the services of a certain type you do want to hide the originals. In general it's safer to hide the original service, but if you want to run without the HidingHooks you can switch this off.

I am using the Felix File Install to provide the configuration and have it stored in a file in load/org.coderthoughts.service.spector.impl.ServiceSpector.config A little known feature of File Install is that is supports typed configuration data in .config files, so the following file contains a Boolean hide.services property and a String[] service.filters property.

hide.services=B"true"
service.filters=["(objectClass\=org.coderthoughts.primes.service.PrimeNumberService)",
    "(objectClass\=org.apache.felix.bundlerepository.RepositoryAdmin)"]

In the ServiceSpector main component I can easily consume this configuration using the new Declarative Services annotation based approach:

@Component(immediate=true)
public class ServiceSpector implements ServiceListener {
  // This API is populated by Config Admin with configuration information
  @interface Config {
    String [] service_filters();
    boolean hide_services() default false;
  }

...

  @Activate
  private synchronized void activate(BundleContext bc, Config cfg) {
    ...
    System.out.println("Service filters: " +
      Arrays.toString(cfg.service_filters()));
    System.out.println("Hide services: " + cfg.hide_services());
  }

The Config annotation is automatically populated with the configuration information by matching the property keys with the annotation methods, which I can easily read out using its typed API (the dots in configuration keys are mangled to underscores).

Service-spector has a Service Listener that looks for services appearing and checks if they need to have aspects applied. If so they will be proxied. Then, when the proxied service is invoked, all the ServiceAspect services are called to let them do their thing. For example the CountingServiceAspect simply counts how often a certain API is invoked: 

@Component(scope=ServiceScope.PROTOTYPE)
public class CountingServiceAspect implements ServiceAspect {
  Map<String, LongAdder> invocationCounts = new ConcurrentHashMap<>();

  @Override
  public String announce() {
    return "Invocation counting started.\n";
  }

  @Override
  public void preServiceInvoke(ServiceReference service, Method method, Object[] args) {
    Class declaringClass = method.getDeclaringClass();
    String key = declaringClass.getSimpleName() + "#" + method.getName();
    invocationCounts.computeIfAbsent(key, k -> new LongAdder()).increment();
  }

  @Override
  public void postServiceInvoke(ServiceReference service, Method method, Object[] args, Object result){
    // nothing to do
  }

  @Override
  public String report() {
    StringBuilder sb = new StringBuilder();
    sb.append("Invocation counts\n");
    sb.append("=================\n");
    for (Map.Entry<String, LongAdder> entry : invocationCounts.entrySet()) {
      sb.append(entry.getKey() + ": " + entry.getValue() + "\n");
    }
    return sb.toString();
  }
}


The CountingServiceAspect is included in the Service Spector bundle. Lets add this and the File Install and Configuration Admin bundles to the framework setup used in the previous post that computes primes. I now have the following bundles installed. I removed bundle 9, the incorrect prime number generator that returns only 1's for now.

g! lb
START LEVEL 1
ID|State     |Level|Name
 0|Active    | 0|System Bundle (5.4.0)|5.4.0
 1|Active    | 1|Apache Felix Bundle Repository (2.0.6)|2.0.6
 2|Active    | 1|Apache Felix Gogo Command (0.16.0)|0.16.0
 3|Active    | 1|Apache Felix Gogo Runtime (0.16.2)|0.16.2
 4|Active    | 1|Apache Felix Gogo Shell (0.10.0)|0.10.0
 5|Active    | 1|Apache Felix Declarative Services (2.0.2)|2.0.2
 6|Active    | 1|service (1.0.0.SNAPSHOT)|1.0.0.SNAPSHOT
 7|Active    | 1|impl (1.0.1.SNAPSHOT)|1.0.1.SNAPSHOT
 8|Installed | 1|client (1.0.0.SNAPSHOT)|1.0.0.SNAPSHOT
10|Active    | 1|Apache Felix Configuration Admin Service (1.8.8)|1.8.8
11|Active    | 1|Apache Felix File Install (3.5.2)|3.5.2
12|Active    | 1|service-spector (1.0.0.SNAPSHOT)|1.0.0.SNAPSHOT


Additionally I have the configuration file from above in the ./load/org.coderthoughts.service.spector.impl.ServiceSpector.config location. This is the default location where the Felix File Install looks. When I start the service-spector bundle it reports the active aspects and its configuration:

g! start 12
Invocation counting started.

Service Spector
===============
Service filters: [(objectClass=org.coderthoughts.primes.service.PrimeNumberService), (objectClass=org.apache.felix.bundlerepository.RepositoryAdmin)]
Hide services: true


Ok so let's start our client:
g! start 8
Services now: $Proxy6(PrimeNumbers)
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
g! stop 8


The version of primes on master has a PrimeServiceReporter that reports the name and the classname of the service. Here we can see that the service calls itself "PrimeNumbers" but that the actual service instance is a Proxy.

As our filters state that we are also proxying the Repository Admin which comes with the Felix implementation we can also exercise that, for example via the shell commands:
 

g! obr:repos list
... some output ...
g! obr:list
... some output
 

Ok so we've now used both of the services that are being proxied. Let's see what APIs have been utilized. The service spector calls report() on all the ServiceAspects when the bundle is being stopped:

g! stop 12
Invocation counts
=================
PrimeNumberService#getServiceName: 1
PrimeNumberService#nextPrime: 30
RepositoryAdmin#listRepositories: 1
RepositoryAdmin#discoverResources: 2

Contributing more aspects

Aspects are contributed as services themselves so they don't need to be part of the service-spector bundle itself. The service-spector-method-timer bundle is a separate bundle that contributes a ServiceAspect that provides average method invocations times.

Get the binaries

The code in this blog post can easily be built using maven, but if you just want the binaries, you can get them from here:
service-spector: https://github.com/coderthoughts/service-spector/releases
method-timer: https://github.com/coderthoughts/service-spector-method-timer/releases

Monday, February 8, 2016

Branch by Abstraction and OSGi

Inspired by my friend Philipp Suter who pointed me at this wired article http://www.wired.com/2016/02/rebuilding-modern-software-is-like-rebuilding-the-bay-bridge which relates to Martin Fowler's Branch by Abstraction I was thinking: how would this work in an OSGi context?

Leaving aside the remote nature of the problem for the moment, let's focus on the pure API aspect here. Whether remote or not really orthogonal... I'll work through this with example code that can be found here: https://github.com/coderthoughts/primes

Let's say you have an implementation to compute prime numbers:
public class PrimeNumbers {
  public int nextPrime(int n) {
    // computes next prime after n - see
https://github.com/coderthoughts/primes details
    return p;
  }
}
And a client program that regularly uses the prime number generator. I have chosen a client that runs in a loop to reflect a long-running program, similar to a long-running process communicating with a microservice:
public class PrimeClient {
  private PrimeNumbers primeGenerator = new PrimeNumbers();
  private void start() {
    new Thread(() -> {
      while (true) {
        System.out.print("First 10 primes: ");
        for (int i=0, p=1; i<10; i++) {
          if (i > 0) System.out.print(", ");
          p = primeGenerator.nextPrime(p);
          System.out.print(p);
        }
        System.out.println();
        try { Thread.sleep(1000); } catch (InterruptedException ie) {}
      }
    }).start();
  }
 
  public static void main(String[] args) {
    new PrimeClient().start();
  }
}
If you have the source code cloned or forked using git, you can run this example easily by checking out the stage1 branch and using Maven:
.../primes> git checkout stage1
.../primes> mvn clean install
... maven output
[INFO] ------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------
Then run it from the client submodule:
.../primes/client> mvn exec:java -Dexec.mainClass=\
org.coderthoughts.primes.client.PrimeClient
... maven output
First 10 primes: 3, 5, 7, 11, 13, 17, 19, 23, 29, 31
First 10 primes: 3, 5, 7, 11, 13, 17, 19, 23, 29, 31
First 10 primes: 3, 5, 7, 11, 13, 17, 19, 23, 29, 31
... and so on ...
Ok so our system works. It keeps printing out prime numbers, but as you can see there is a bug in the output. We also want to replace it in the future with another implementation. This is what the Branch by Abstraction Pattern is about.

In this post I will look at how to do this with OSGi Services. OSGi Services are just POJOs registered in the OSGi Service Registry. OSGi Services are dynamic, they can come and go, and OSGi Service Consumers dynamically react to these changes, as well see. In the following few steps will change the implementation to an OSGi Service. Then we'll update the service at runtime to fix the bug above, without even stopping the service consumer. Finally we can replace the service implementation with a completely different implementation, also without even stopping the client.

Turn the application into OSGi bundles

We'll start by turning the program into an OSGi program that contains 2 bundles: the client bundle and the impl bundle. We'll use the Apache Felix OSGi Framework and also use OSGi Declarative Services which provides a nice dependency injection model to work with OSGi Services.

You can see all this on the git branch called stage2:
.../primes> git checkout stage2
.../primes> mvn clean install
The Client code is quite similar to the original client, except that it now contains some annotations to instruct DS to start and stop it. Also the PrimeNumbers class is now injected instead of directly constructed via the @Reference annotation. The greedy policyOption instructs the injector to re-inject if a better match becomes available:
@Component
public class PrimeClient {
  @Reference(policyOption=ReferencePolicyOption.GREEDY)
  private PrimeNumbers primeGenerator;
  private volatile boolean keepRunning = false;
 
  @Activate
  private void start() {
    keepRunning = true;
    new Thread(() -> {
      while (keepRunning) {
        System.out.print("First 10 primes: ");
        for (int i=0, p=1; i<10; i++) {
          if (i > 0) System.out.print(", ");
          p = primeGenerator.nextPrime(p);
          System.out.print(p);
        }
        System.out.println();
        try { Thread.sleep(1000); } catch (InterruptedException ie) {}
      }
    }).start();
  }
 
  @Deactivate
  private void stop() {
    keepRunning = false;
  }
}
The prime generator implementation code is the same except for an added annotation. We register the implementation class in the Service Registry so that it can be injected into the client:
@Component(service=PrimeNumbers.class)
public class PrimeNumbers {
  public int nextPrime(int n) {
    // computes next prime after n
    return p;
  }
}
As its now an OSGi application, we run it in an OSGi Framework. I'm using the Apache Felix Framework version 5.4.0, but any other OSGi R6 compliant framework will do.
> java -jar bin/felix.jar
g! start http://www.eu.apache.org/dist/felix/org.apache.felix.scr-2.0.2.jar
g! start file:/.../clones/primes/impl/target/impl-0.1.0-SNAPSHOT.jar
g! install file:/.../clones/primes/client/target/client-0.1.0-SNAPSHOT.jar
Now you should have everything installed that you need:
g! lb
START LEVEL 1
ID|State |Level|Name
0|Active | 0|System Bundle (5.4.0)|5.4.0
1|Active | 1|Apache Felix Bundle Repository (2.0.6)|2.0.6
2|Active | 1|Apache Felix Gogo Command (0.16.0)|0.16.0
3|Active | 1|Apache Felix Gogo Runtime (0.16.2)|0.16.2
4|Active | 1|Apache Felix Gogo Shell (0.10.0)|0.10.0
5|Active | 1|Apache Felix Declarative Services (2.0.2)|2.0.2
6|Active | 1|impl (0.1.0.SNAPSHOT)|0.1.0.SNAPSHOT
7|Installed | 1|client (0.1.0.SNAPSHOT)|0.1.0.SNAPSHOT
We can start the client bundle:
g! start 7
First 10 primes: 3, 5, 7, 11, 13, 17, 19, 23, 29, 31
First 10 primes: 3, 5, 7, 11, 13, 17, 19, 23, 29, 31
... and so on ..
You can now also stop the client:
g! stop 7
Great - our OSGi bundles work :)
Now we'll do what Martin Fowler calls creating the abstraction layer.

Introduce the Abstraction Layer: the OSGi Service

Go to the branch stage3 for the code:
.../primes> git checkout stage3
.../primes> mvn clean install
The abstraction layer for the Branch by Abstraction pattern is provided by an interface that we'll use as a service interface. This interface is in a new maven module that creates the service OSGi bundle.
public interface PrimeNumberService {
    int nextPrime(int n);
}
Well turn our Prime Number generator into an OSGi Service. The only difference here is that our PrimeNumbers implementation now implements the PrimeNumberService interface. Also the @Component annotation does not need to declare the service in this case as the component implements an interface it will automatically be registered as a service under that interface:
@Component
public class PrimeNumbers implements PrimeNumberService {
    public int nextPrime(int n) {
      // computes next prime after n
      return p;
    }
}
Run everything in the OSGi framework. The result is still the same but now the client is using the OSGi Service:
g! lb
START LEVEL 1
   ID|State      |Level|Name
    0|Active     |    0|System Bundle (5.4.0)|5.4.0
    1|Active     |    1|Apache Felix Bundle Repository (2.0.6)|2.0.6
    2|Active     |    1|Apache Felix Gogo Command (0.16.0)|0.16.0
    3|Active     |    1|Apache Felix Gogo Runtime (0.16.2)|0.16.2
    4|Active     |    1|Apache Felix Gogo Shell (0.10.0)|0.10.0
    5|Active     |    1|Apache Felix Declarative Services (2.0.2)|2.0.2
    6|Active     |    1|service (1.0.0.SNAPSHOT)|1.0.0.SNAPSHOT
    7|Active     |    1|impl (1.0.0.SNAPSHOT)|1.0.0.SNAPSHOT
    8|Resolved  |    1|client (1.0.0.SNAPSHOT)|1.0.0.SNAPSHOT
g! start 8
First 10 primes: 3, 5, 7, 11, 13, 17, 19, 23, 29, 31
You can introspect your bundles too and see that the client is indeed wired to the service provided by the service implementation:
g! inspect cap * 7
org.coderthoughts.primes.impl [7] provides:
-------------------------------------------
...
service; org.coderthoughts.primes.service.PrimeNumberService with properties:
   component.id = 0
   component.name = org.coderthoughts.primes.impl.PrimeNumbers
   service.bundleid = 7
   service.id = 22
   service.scope = bundle
   Used by:
      org.coderthoughts.primes.client [8]
Great - now we can finally fix that annoying bug in the service implementation: that it missed 2 as a prime! While we're doing this we'll just keep the bundles in the framework running...

Fix the bug in the implementation whitout stopping the client

The prime number generator is fixed in the code in stage4:
.../primes> git checkout stage4
.../primes> mvn clean install
It's a small change to the impl bundle. The service interface and the client remain unchanged. Let's update our running application with the fixed bundle:
First 10 primes: 3, 5, 7, 11, 13, 17, 19, 23, 29, 31
First 10 primes: 3, 5, 7, 11, 13, 17, 19, 23, 29, 31
First 10 primes: 3, 5, 7, 11, 13, 17, 19, 23, 29, 31
g! update 7 file:/.../clones/primes/impl/target/impl-1.0.1-SNAPSHOT.jar
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
Great - finally our service is fixed! And notice that the client did not need to be restarted! The  DS injection, via the @Reference annotation, handles all of the dynamics for us! The client code simply uses the service as a POJO.

The branch: change to an entirely different service implementation without client restart

Being able to fix a service without even restarting its users is already immensely useful, but we can go even further. I can write an entirely new and different service implementation and migrate the client to use that without restarting the client, using the same mechanism. 

This code is on the branch stage5 and contains a new bundle impl2 that provides an implementation of the PrimeNumberService that always returns 1. 
.../primes> git checkout stage5
.../primes> mvn clean install
While the impl2 implementation obviously does not produce correct prime numbers, it does show how you can completely change the implementation. In the real world a totally different implementation could be working with a different back-end, a new algorithm, a service migrated from a different department etc...

Or alternatively you could do a façade service implementation that round-robins across a number of back-end services or selects a backing service based on the features that the client should be getting.
In the end the solution will always end up being an alternative Service in the service registry that the client can dynamically switch to.

So let's start that new service implementation:
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
g! start file:/.../clones/primes/impl2/target/impl2-1.0.0-SNAPSHOT.jar
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
g! stop 7
First 10 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29
First 10 primes: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
First 10 primes: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
Above you can see that when you install and start the new bundle initially nothing will happen. At this point both services are installed at the same time. The client is still bound to the original service as its still there and there is no reason to rebind, the new service is no better match than the original. But when the bundle that provides the initial service is stopped (bundle 7) the client switches over to the implementation that always returns 1. This switchover could happen at any point, even halfway thought the production of the list, so you might even be lucky enough to see something like:
First 10 primes: 2, 3, 5, 7, 11, 13, 1, 1, 1, 1
I hope I have shown that OSGi services provide an excellent mechanism to implement the Branch by Abstraction pattern and even provide the possibility to do the switching between suppliers without stopping the client!

In the next post I'll show how we can add aspects to our services, still without modifying or even restarting the client. These can be useful for debugging, tracking or measuring how a service is used.

PS - Oh, and on the remote thing, this will work just as well locally or Remote. Use OSGi Remote Services to turn your local service into a remote one... For available Remote Services implementations see https://en.wikipedia.org/wiki/OSGi_Specification_Implementations#100:_Remote_Services

With thanks to Carsten Ziegeler for reviewing and providing additional ideas.

Wednesday, May 28, 2014

Running bndtools OSGi builds with Maven - bnd-maven-plugin now released!

The bndtools.org project provides some really great tooling for building OSGi bundles. This includes help for generating the bundle manifest, templates for working with Declarative Services, testing support and more. There's tooling for resolving deployments against repositories, and last but certainly not least - automatic semantic versioning. The tooling for semantic versioning will tell you whether the version of your exported packages is correct, by comparing the content of your APIs with the previous release of the bundle.
I won't go into the full feature set of bndtools here, have a look at the bndtools website for all the goodness. Another thing to say about bndtools is that it has a really active community, which is always a good thing to know when you're deciding on using an opensource project.

Example IDE screen, taken from bndtools.org
The bndtools project itself focuses on providing an Eclipse-based graphical IDE for developing OSGi bundles. However you normally also want to be able to build your bundles in a headless build. For example as part of a Jenkins/Hudson process or simply to build it from the command-line yourself.
The bndtools project provides headless build options using Ant and Gradle, but for many people being able to use Maven for builds is essential.

Most OSGi development in Maven is done these days using the maven-bundle-plugin or using Eclipse Tycho. Tycho focuses on building OSGi bundles using a manifest-first approach which is supported by the Eclipse PDE tooling. 
The maven-bundle-plugin uses the pom.xml file as the source of information for the bundle and while it uses bnd under the covers, just like the bndtools project, maven-bundle-plugin is a little different in that it uses the pom.xml as the source of information for the build. The bndtools IDE uses bnd.bnd files for this.

This is where the bnd-maven-plugin comes in. Toni Menzel started it a while ago, Peter Kriens worked on it, and I have been putting bits and pieces in recently. The bnd-maven-plugin basically builds a bndtools project exactly like the bndtools IDE would build it, but then from Maven. As of today the first version of the bnd-maven-plugin  (called 1.0.2 - mea culpa ;) has been released to Maven Central, so you can easily use it in your Maven builds.

Precise instructions on how to use it can be found in the project readme, but you can easily try it out by looking at the bnd-maven-plugin samples in github. They show what your pom should look like. The samples contain the following test projects:
TestBundle - this is a simple default-layout bndtools project that can be built with Maven.
TestBundle2 - a bndtools project that follows Maven conventions for directory locations of source and output files. It can be built using Maven and the bndtools IDE. This project also contains an example unit test.
TestBundle3 - this project contains an OSGi Framework integration test, which can be run both from Maven and Bndtools.
Additionally, the sample structure contains
ParentPom - a module that contains a parent pom used by all the other submodules.
MBPBundle - a module that builds a bundle using the maven-bundle-plugin, showing that it can be used alongside the bnd-maven-plugin, although within a single module you need to choose one or the other.

To run it with Maven you can simply go into the parent directory and execute mvn install:
$ .../samples/sample-projects $ mvn install
... lots of maven output ...
[INFO] --- maven-surefire-plugin:2.17:test (default-test) @ TestBundle2 ---
[INFO] Surefire report directory: /Users/David/clones/bnd/bnd-maven-plugin-parent/samples/sample-projects/TestBundle2/target/surefire-reports
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.foo.bar.UnitTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.003 sec - in org.foo.bar.UnitTest
Results :
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
...
[INFO] --- bnd-maven-plugin:1.0.2:integration-test (default-integration-test) @ TestBundle3 ---
[INFO] Running the Bnd OSGi integration tests
Tests run  : 1
Passed     : 1
Errors     : 0
Failures   : 0
No Errors
...
[INFO] Finished running the Bnd OSGi integration tests
[INFO] ----------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] ParentPom ......................................... SUCCESS [0.185s]
[INFO] TestBundle ........................................ SUCCESS [4.134s]
[INFO] TestBundle2 ....................................... SUCCESS [0.361s]
[INFO] TestBundle3 ....................................... SUCCESS [1.041s]
[INFO] MBPBundle ......................................... SUCCESS [0.842s]
[INFO] sample-projects ................................... SUCCESS [0.041s]
[INFO] ----------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ----------------------------------------------------------------------
[INFO] Total time: 11.707s

The above has built all modules, ran the unit tests in TestBundle2 and the integration tests in TestBundle3.

Use them with the bndtools IDE!

As the TestBundle, TestBundle2 and TestBundle3 projects are bndtools Eclipse projects, they can be imported via File -> Import -> Existing Projects into Workspace (along with the bnd cnf project):

Then you can use the IDE for all your dev needs, e.g. to execute the OSGi integration tests:


This is only the first release, but should be good enough to start playing with it. See if it works for you. Give us feedback! Help developing it further! You can fork, file issues and patches in the bndtools/bnd project on github. Enjoy :)

Monday, January 13, 2014

OSGi Subsytems on Apache Felix

Note: more up-to-date information on getting started with Apache Aries Subsystems can be found here: http://aries.apache.org/modules/subsystems.html

Last year I blogged about running the OSGi Subsystems implementation from Apache Aries. At the time Equinox was the only OSGi framework that had the full Core R5 implementation supported, which is needed in order to run OSGi Subsystems.

With some recent work on the Apache Felix container, it is now very close to supporting the full OSGi R5 specification. One of this first things that I tried with that is to run the Aries Subsystems implementation on it. And it works :)

OSGi Subsystems are a great way to package a number of OSGi bundles together, for example if you want to distribute an OSGi-based application. Subsystems provide a really convenient deployment model, without compromising the modularity of your application. For more background on Subsystems, see my earlier post about how to use OSGi Subsystems to deploy your applications.

What I'm doing here is take the example from that blog post and run it on Apache Felix with Aries Subsystems.

First of all you need the latest and greatest Felix Framework. I always build it as follows:

  svn co http://svn.apache.org/repos/asf/felix/trunk felix
  cd felix/framework
  mvn install 
  cd ../main
  mvn install 

This will give you a framework runtime in the felix/main folder.
Then add the following bundles to the bundle subdirectory:

All of the bundles above have links behind them where they can be downloaded, except for the Felix Coordinator implementation, as this one is very new. You can just build it from the coordinator directory of your Felix checkout. Or, if you prefer to use released components, you can also use the Equinox implementation org.eclipse.equinox.coordinator_1.1.0.v20120522-1841, but in order to run that on Felix you also need the Equinox supplement bundle org.eclipse.equinox.supplement_1.5.0.v20130812-2109.
Finally, I'm adding my little gogo command bundle as described in my older post, to add subsystem:list, subsystem:install, subsystem:uninstall, subsystem:start and subsystem:stop commands to Gogo. You can also download that as a bundle here: subsystem-gogo-command-1.0.0.jar

Ok, let's start up Felix and look what's inside:
.../felix/main $ java -jar bin/felix 
... some log messages appear ...
____________________________
Welcome to Apache Felix Gogo
g! lb
START LEVEL 1
   ID|State      |Level|Name
    0|Active     |    0|System Bundle (4.3.0.SNAPSHOT)
    1|Active     |    1|Apache Aries Application API (1.0.0)
    2|Active     |    1|Apache Aries Application Modelling (1.0.0)
    3|Active     |    1|Apache Aries Application Utils (1.0.0)
    4|Active     |    1|Apache Aries Blueprint Bundle (1.1.0)
    5|Active     |    1|Apache Aries Proxy Bundle (1.0.1)
    6|Active     |    1|Apache Aries Subsystem API (1.0.0)
    7|Active     |    1|Apache Aries Subsystem Core (1.0.0)
    8|Active     |    1|Apache Aries Util (1.1.0)
    9|Active     |    1|Apache Felix Bundle Repository (1.6.6)
   10|Active     |    1|Apache Felix Configuration Admin Service (1.8.0)
   11|Active     |    1|Apache Felix Coordinator Service (0.0.1.SNAPSHOT)
   12|Active     |    1|Apache Felix Gogo Command (0.12.0)
   13|Active     |    1|Apache Felix Gogo Runtime (0.10.0)
   14|Active     |    1|Apache Felix Gogo Shell (0.10.0)
   15|Active     |    1|Apache Felix Resolver (1.0.0)
   16|Active     |    1|Region Digraph (1.1.0.v20120522-1841)
   17|Active     |    1|slf4j-api (1.7.5)
   18|Resolved   |    1|slf4j-simple (1.7.5)
   19|Active     |    1|Subsystem Gogo Command (1.0.0)
   20|Active     |    1|org.osgi.service.subsystem.region.context.0 (1.0.0)

g! subsystem:list
0 ACTIVE org.osgi.service.subsystem.root

Looking at the bundles, you can see the Gogo command line ones that come with Felix, as well as its Bundle Repo. You can also see bundle 20, which is a synthesised bundle that the Subsystems implementation added and which represents the root subsystem. The subsystem:list command reports that there is one subsystem: the root one.

Now let's look at my example applications again. I have the following two subsystems:


Subsystems are a great way to distribute multi-bundle applications, and my two subsystems each contain 2 specific bundles as well as a shared bundle.

Let's install then and see what happens:

g! subsystem:install http://coderthoughts.googlecode.com/files/subsystem1.esa
Installing subsystem: http://coderthoughts.googlecode.com/files/subsystem1.esa
Subsystem successfully installed: subsystem1; id: 1
g! subsystem:start 1
g! lb
START LEVEL 1
   ID|State      |Level|Name
...
   21|Active     |    1|SharedBundle (1.0.0)
   22|Active     |    1|BundleA (1.0.0)
   23|Active     |    1|BundleB (1.0.0)
The three bundles needed by subsystem1 are installed and all started with the subsystem:start command.

And let's add the second subsystem...
g! subsystem:install http://coderthoughts.googlecode.com/files/subsystem2.esa
Installing subsystem: http://coderthoughts.googlecode.com/files/subsystem2.esa
Subsystem successfully installed: subsystem2; id: 2
g! subsystem:start 2
g! lb
START LEVEL 1
   ID|State      |Level|Name
...
   21|Active     |    1|SharedBundle (1.0.0)
   22|Active     |    1|BundleA (1.0.0)
   23|Active     |    1|BundleB (1.0.0)
   24|Active     |    1|BundleC (1.0.0)
   25|Active     |    1|BundleD (1.0.0)

As expected, the two new bundles for subsystem2 are now also installed. And, because we're talking about a feature subsystem here, where everything is shared, the SharedBundle (21) is not installed a second time, but rather reused from subsystem1. 

The topic of subsystems is much bigger. Subsystems can provide a certain level of isolation, they can work with Repositories for provisioning and you have a lot of options wrt to what you can put inside an .esa file: you can put all the bundles in there that your application needs, or you can just have a textual descriptor that declares your main bundles and let the OSGi Resolver and Repository find the dependencies and deploy these for you. Details about these various options can be found in the OSGi Subsystem spec (chapter 134 in the OSGi R5 Enterprise spec). 

All in all good news - Subsystems now works in Apache Felix as well. Right now you need to build the latest snapshot, but hopefully we'll have a Felix Framework release for this soon!

Tuesday, October 22, 2013

Role-based access control for Karaf shell commands and OSGi services

In a previous post I outlined how role-based access control was added to JMX in Apache Karaf. While JMX is one way to remotely manage a Karaf instance, another management mechanism is provided via the Karaf Console/Shell. Up until now security for console commands was very coarse-grained. Once in the console you had access to all the commands. For example, it was not possible to give certain users access to merely changing their own configuration without also giving them access to shutting down the whole karaf instance.

With commit r1534467 this has now changed (thanks again to JB Onofré for reviewing and applying my pull request). You can now define roles required for each shell command and even have different roles depending on the arguments used with a certain command. This is achieved by using a relatively advanced feature of OSGi: Service Registry Hooks. These hooks give you a lot of control on how the OSGi service registry behaves. I blogged about them before. They enable you to:
  • see what service consumers are looking for, so you can register these services on-the-fly. This is often used to import remote services from discovery, but only if there is actually a client for them.
  • hide services from certain service consumers
  • change the service properties the client sees for a service by providing an alternative registration
  • proxy the original service
Every Karaf command is in reality an Apache Felix Gogo command, registered as an OSGi service. Every command has two service registration properties: osgi.command.scope and osgi.command.function. These properties define the name of the command and its scope. With the use of the OSGi Service Registry hooks I can replace the original service with a proxy that adds the role-based security. 

When I originally floated this idea on the Karaf mailing list, Christian Schneider said: "why don't we enable this for all services?" Good idea! So that's how I ended up implementing it. I first added a mechanism to add role-based access control to OSGi services in general and then applied this mechanism to get role-based access control for the Karaf commands.

Under the hood

the original service is hidden by OSGi Service Registry Hooks
The theory is quite simple. As mentioned above you can use OSGi Service Registry hooks to hide a service from certain consuming bundles and effectively replace it with another. In my case the replacement is a proxy of the original service with the same service registration properties (and some extra ones, see below). It will delegate an invocation to the original service, but before it does so it will check the ACL for the service being invoked to find out what the permitted roles are. Then it checks the roles of the current user by looking at the Subject in the current AccessControlContext. If the user doesn't have any of the permitted roles the service invocation is aborted with a SecurityException.

How do I configure ACLs for OSGi services?

ACLs for OSGi services are defined in a way similar to how these are defined for JMX access: through the OSGi Configuration Admin service. The PID needs to start with org.apache.karaf.service.acl. but the exact PID value isn't important. The service to which the ACL is matched is found through the service.guard configuration value. Configuration Admin is very flexible wrt to how configuration is stored, but by default in Karaf these configurations are stored as .cfg files in the etc/ directory. Let's say I have a service in my system that implements the following API and is registered in the OSGi service registry under the org.acme.MyAPI interface:
  package org.acme;

  public interface MyAPI {
    void doit(String s);
  }
If I want to specify an ACL to say that only clients that have the manager role can invoke this service, I have to do two things:
  1. First I need to enable the role-based access for this service by including it in the filter specified in the etc/system.properties in the karaf.secured.services property:
      karaf.secured.services=(|(objectClass=org.acme.MyAPI)(...what was there already...))
    only services matching this property are enabled for role-based access control. Other services are left alone.
  2. Define the ACL for this service as Config Admin configuration. For example by creating a file etc/org.apache.karaf.service.acl.myapi.cfg:
      service.guard=(objectClass=org.acme.MyAPI)
      doit=manager

    So the actual PID of the configuration is not really important, as long as it starts with the prefix. The service it applies to is then selected by matching the filter in the service.guard property.
There are some additional rules. There is a special role of * which means that ACLs are disabled for this method. Similar to the JMX ACLs you can also specify function arguments that require specific roles. For more details see the commit message.

Setting roles for service invocation

The service proxy checks the roles in the current AccessControlContext against the required ones. So when invoking a service that has role-based access control enabled, you need to set these roles. This is normally done as follows:
  import javax.security.auth.Subject;
  import org.apache.karaf.jaas.boot.principal.RolePrincipal;
  // ... 
  Subject s = new Subject();
  s.getPrincipals().add(new RolePrincipal("manager"));
  Subject.doAs(s, new PrivilegedAction() {
    public Object run() {
      svc.doit("foo"); // invoke the service
    }
  }
This example uses a Karaf built-in role. You can also use your own role implementations by specifying them using the className:roleName syntax in the ACL.

Note however, that javax.security.auth.Subject is a very powerful API. You should give bundles that import it extra scrutiny to ensure that they don't give access to clients that they shouldn't really have...

Applied to Shell Commands

Next step was to apply these concepts to the Karaf shell commands. As all the shell commands are registered with the osgi.command.function and osgi.command.scope properties. I enabled them in the default Karaf configuration with the following system property:
  karaf.secured.services=(&(osgi.command.scope=*)(osgi.command.function=*))

The next thing is to configure command ACLs. However that presented a slight usability problem. Most of the services in Karaf are implemented (via OSGi Blueprint) using the Function interface. Which means that the actual method name is always execute. It also means that you need to create a separate Configuration Admin PID for each command which is quite cumbersome. You really want to configure this stuff on a per-scope level with all the commands for a single scope in a single configuration file. To allow this the command-integration code contains a configuration transformer which creates service ACLs as described above but based on command scope level configuration files.
The command scope configuration file must have a PID that follows this structure: org.apache.karaf.command.acl.<scope> So if you want to create such a file for the feature scope the config file would be in etc/org.apache.karaf.command.acl.feature.cfg:
  list = viewer
  info = viewer
  install = admin
  uninstall = admin
In this example only users with the admin role can do install/uninstall operations while viewers can list features etc... Note that by using groups (as outlined in this previous post) users added to an admin group will also have viewer permissions, so will be able to do everything. For a more complex configuration file, have a look at this one.

Can I find out what roles a service requires?

It can be useful to know in advance what the roles are that are required to invoke a service. For example the shell supports tab-style command completion. You don't want to show commands to the user that you know are not available to the user's roles. For this purpose an additional service registration property is added to the proxy service registration: org.apache.karaf.service.guard.roles=[role1,role2]. The value of the property is the Collection of roles that can possibly invoke a method on the service. Since each command maps to a single service, we can have a Command Processor that only selects the commands applicable to the roles of the current user. This means that commands that this user doesn't have the right roles for are automatically hidden from autocompletion etc. When I'm logged in as an admin I can see all the feature commands (I removed ones not mentioned in the config for brevity):
  karaf@root()> feature: <hit TAB>
  info            install         list            uninstall
while Joe, a viewer, only see the feature commands available to viewers:
  joe@root()> feature: <hit TAB>
  info            list

In some cases the commands have roles associated with particular values being passed in. For example the config admin shell commands require admin rights for certain PIDs but not all. So Joe can safely edit his own configuration but is prevented from editing system level configuration:
  joe@root(config)> edit org.acme.foo
  joe@root(config)> property-set somekey someval
  joe@root(config)> update
So Joe can edit the org.acme.foo PID, but when he tries to edit the jmx.acl PID access is denied:
  joe@root(config)> edit jmx.acl
  Error executing command: Insufficient credentials.

Where are we with this stuff today?

The first commits to enable the above have just gone into Karaf trunk and although I wrote lots of unit tests for it, more use is needed to see whether it all works as users would expect. Also the default ACL configuration files may need a bit more attention. What's there now is really a start, the idea is to refine as we go along and have this as a proper feature for Karaf 3.

The power of OSGi services

One thing that this approach shows is really the power and flexibility of OSGi services. None of the code of the actual commands was changed. The ability to build role-based access on top of them in a non-intrusive way was really enabled by the OSGi service registry design and its capabilities.

Thursday, October 3, 2013

JMX role-based access control for Karaf

Recently I worked on adding role-based access control to Karaf management operations. This work is split into two parts: one part focuses on adding role-based access to JMX. Another part focuses on the Karaf shell/console. In this post I'm looking at how JMX access is secured.

JMX plays an important role in Karaf as a remote management mechanism. A number of management clients are built on top JMX, hawtio being probably the most popular one right now. While hawtio uses JMX through Jolokia which exposes the JMX API over a REST interface, other clients use JMX locally (e.g. via JConsole) or over a remote connector. 

Most functionality available in Karaf can be managed via MBeans, but up until now it was suffering from one issue, there was really only one level of access. If you were given rights to access, you had access to all the MBeans. It was not possible to give users access to certain areas in JMX while restricting access to other areas.


Role-based Access Control

With commit r1528587 my JMX role-based access has been added to Karaf trunk (extra kudos and thanks to Jean-Baptiste Onofré for additional testing, finding a number of bugs, fixing those and actually applying the commits!). It means that an administrator can now declare the roles required to access certain Karaf MBeans. And, it also applies to MBeans that are registered outside of Karaf, but running in the same MBeans server. So JRE-provided MBeans and MBeans coming from OSGi bundles that are installed on top of Karaf are also covered.

How does it work?

It works by inserting a JMX Guard which is configured via a JVM-wide MBeanServerBuilder. The Karaf launching scrips are updated to contain the following argument: -Djavax.management.builder.initial=org.apache.karaf.management.boot.KarafMBeanServerBuilder
This global JVM-level MBeanServerBuilder calls into an OSGi bundle that contains the JMX Guard for each JMX invocation made. The Guard in turn looks up the ACL of the accessed MBean in the OSGi Configuration Admin Service and checks the required roles for this MBean with the RolePrincipal objects present in the Subject in the current AccessControlContext. If no matching role is present, the JMX invocation will be blocked with a SecurityException.

How can I define my ACLs?

The Access Control Lists are stored in OSGi Configuration Admin. This means that they can be defined in whatever way the currently configured Config Admin implementation stores its information, which could be a database, nosql, etc... In the case of Karaf this information is normally stored in the etc/ directory in .cfg text files. The file name (excluding the .cfg extension) represents the Config Admin PID. JMX ACLs are mapped to Config Admin PIDs by prefixing them with jmx.acl. Then the Object Name as it appears in the JConsole tree is used to identify the MBean. So the ActiveMQ QueueA MBean as in the screenshot below would map to the PID jmx.acl.org.apache.activemq.Broker.amq-broker.Queue.QueueA
The 'purge' operation is denied if the user does not have the required role
However, having to write a configuration file for every MBean isn't really that user-friendly. It would be nice if we could define this stuff on a slightly higher level. Therefore the code that looks for the ACL PIDs follows a hierarchical approach. If it cannot find any matching definitions for the operation invoked on ...QueueA PID, it goes up in the tree and looks for definitions in jmx.acl.org.apache.activemq.Broker.amq-broker.Queue and then jmx.acl.org.apache.activemq.Broker.amq-broker and so on. So if you want to specify an ACL for all queues on all ActiveMQ brokers you could do this in the jmx.acl.org.apache.activemq.Broker.cfg file. For example:
  browse*          = manager, viewer
  getMessage       = manager, viewer
  purge            = admin
  remove*          = admin
  copy*            = manager
  sendTextMessage* = manager
Note that this example uses wildcards for method names, so browse* covers browse(), browseAsTable() and browseMessages(). Additionally even though the admin role has access to all APIs it's not explicitly listed everywhere. This is not because the admin role is special, this is because administrators are expected to be part of the admingroup, which has all the roles in the system.

Groups

To keep the ACLs manageable I used the concept of JAAS groups. Typically you want to give an administrator access to everything in the system, but it's very cumbersome (and ugly) to add 'admin' to every single ACL definition in the system. Therefore the idea is that an administrator is not directly assigned the admin role, but is rather added to the admingroup. This group then has all the roles defined in the system. And no, it's not magic. If you decide to add a new group then the admingroup needs to be updated. Here's what the definition of some users might look like:
  karaf@root()> jaas:realm-manage --realm karaf
  karaf@root()> jaas:user-list
  User Name | Group        | Role
  ----------------------------------
  karaf     | admingroup   | admin
  karaf     | admingroup   | manager
  karaf     | admingroup   | viewer
  joe       | managergroup | manager
  joe       | managergroup | viewer
  mo        |              | viewer

So in this example, the karaf user is in the admingroup and because of that has the roles admin, manager and viewer.

Default Configuration

There is default configuration that applies to any MBean if it doesn't have specific configuration. This can be found at the top of the hierarchy in the jmx.acl.cfg file:
  list* = viewer
  get*  = viewer
  is*   = viewer
  set*  = admin
  *     = admin
So the default is that any operation on any MBean starting with 'list', 'get' or 'is' is assumed to be an operation that you only need the viewer role for, while set* or any other operation name requires the admin role by default. This also maps well to MBeans that define JMX attributes. Obviously these defaults don't apply if a more specific definition for the MBean can be found...

Redefine to suit

While the Karaf distro comes with some predefined configuration in the form of jmx.acl.**.cfg files, it might be possible that this doesn't map 100% to the roles being used in your organization. Therefore all of this can be changed by the administrator. Nothing is hard coded, so feel free to add new roles, new groups and new ACLs to suit your organizational structure.

ACL definition details

The ACL examples in this posting are on the method level, but in some cases you want to define roles based on the arguments being passed into the operation. For example, you might need admin rights to uninstall a karaf system bundle, but maybe the manager role is enough to uninstall other bundles. Therefore you can define roles based on arguments passed in to the JMX operation either as literal arguments or using regular expressions. For more information on this, see the original commit message in github: 

What MBeans can I use?

If you're writing a rich client or other tool over JMX it can be nice to know in advance whether the current user can invoke certain operations or not. It allows the tool to only show the relevant widgets (buttons, menus etc) if it's actually possible to use the associated MBeans. For this use-case I added an MBean org.apache.karaf:type=security,area=jmx that has a number of canInvoke() operations. It allows you to check whether the currently logged in user can invoke any methods on a given MBean at all or whether it can invoke a certain method. There is also a bulk query operation that lets you check a whole bunch of operations in one go. The nice thing about this approach is that the client doesn't need to know anything about how the roles are mapped by the administrator. It simply checks whether the currently logged in user has the appropriate roles for the operations requested. This means that if the administrator decides to revamp the whole role-mapping on the back-end the client console will automatically adapt: no duplication of information or hard-coded role names needed. For more details about the canInvoke() method see: https://github.com/bosschaert/karaf/blob/f793e70612c47d16a95ef12287514c603613f2c0/management/server/src/main/java/org/apache/karaf/management/JMXSecurityMBean.java

Changing permissions at Runtime

As with nearly everything in OSGi, the Configuration Admin service is dynamic, which means that you can change the information at runtime. This means that you can change the role mappings while the system is running and even for a user that is logged in. You can add or take away privileges dynamically, for example if a trusted user is all of a sudden causing havoc, you can remove the rights associated with the roles of that user dynamically and stop any further damage instantly.

What's next?

I am also working on implementing RBAC for Karaf shell/console commands and will write another post about that when available on trunk.