Apache Archiva 1.3.3 released: performance improvements!

If you’re using Archiva for your repository management needs, you should definitely upgrade to the latest release. Download it now!

The 1.3.x series has focused on the biggest offenders in memory usage and performance problems, and Archiva 1.3.3 brings the biggest improvements yet:

  • Full scans should take about a third of the time and consume far less memory
  • Removed one-off memory hits at the end of a scan
  • File descriptor use during concurrent deployments is better managed

In addition, a new system status page is available for assessing the cause of potential performance issues at runtime, giving better insight into how to tune memory or scanning settings appropriately.

This work is in advance of the upcoming Archiva 1.4 release, which has revised the internals more significantly, with further performance improvements and a series of new features.

It’s also worth noting that we dropped support for Archiva 1.1.x and Archiva 1.2.x in November, so there’s no reason left to remain on older versions.

I’d like to thank YourKit, who provided a free license for their profiler, which was of great assistance in tracking down these issues. I’ve used it on occasion for a number of years, and it is one of the easiest tools to use that I’ve ever encountered.

The full set of issues resolved follows:

  • [MRM-1097] – Error 500 "too many open files"
  • [MRM-1369] – Editing user roles in archiva clobbers continuum redback roles
  • [MRM-1396] – Purge task problem : Not enough parts to the path
  • [MRM-1421] – Archiva repository purge incorrectly purges based on file timestamps even when the snapshot timestamp is known
  • [MRM-1443] – repository statistics collection can cause server to hang
  • [MRM-1416] – upgrade to Redback 1.2.5
  • [MRM-1439] – improve indexing performance
  • [MRM-1440] – system status page
  • [MRM-1441] – monitor repository scanning progress
  • [MRM-1442] – track time spent in each consumer during a scan, to help diagnose poor scanning performance
  • [MRM-1445] – disable referrer check by default

Apache Archiva 1.3 release and what’s next

In the midst of a busy couple of weeks, I neglected to post about the Archiva 1.3 release that was announced recently (and on that topic, Continuum has posted a new beta release as well).

The Archiva release focused mostly on bugfixes (particularly for indexing and LDAP), but we decided it was worthy of a version bump after the addition of an upload audit logging feature and some decent performance improvements. It’s an easy upgrade for 1.2 users – if you keep your configuration separate, then just unzip the new version and start it up using the same environment variables as you would the previous version.

Archiva is also easy to try out if you already have a Maven repository – its primary storage is the filesystem in a Maven repository format, so you can point it at a copy and everything will be available straight away (gradually indexing resources for access through the UI in the background).
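To make the "Maven repository format" concrete, here is a minimal sketch of how artifact coordinates map to a path in the standard Maven 2 layout. The `RepositoryLayout` class and its `toPath` method are hypothetical names used only for illustration and are not part of Archiva's API. The sketch ignores classifiers and timestamped snapshot file names.

```java
// Hypothetical helper for illustration only; not Archiva's actual code.
final class RepositoryLayout {
    private RepositoryLayout() {}

    // Maps coordinates to a relative path in the standard Maven 2 layout:
    // groupId (dots become slashes) / artifactId / version / artifactId-version.extension
    static String toPath(String groupId, String artifactId, String version, String extension) {
        String groupDir = groupId.replace('.', '/');
        String fileName = artifactId + "-" + version + "." + extension;
        return String.join("/", groupDir, artifactId, version, fileName);
    }

    public static void main(String[] args) {
        // Example coordinates chosen for illustration.
        System.out.println(toPath("org.apache.archiva", "archiva-model", "1.3.3", "jar"));
    }
}
```

Because the layout is just directories and files, pointing Archiva at a copy of an existing repository needs no import step. The paths it serves are the ones already on disk.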

As for what’s next – while it’s still to be put to a vote, I hope that the next version will be based on the work I started again last year and have had in mind since almost the beginning of the project. This focuses on two underlying aspects: the removal of the Archiva database requirement and the transformation to be an extensible metadata repository.

I always refer to the database removal as “Back to the Future”, since it is similar to the pre-1.0 design where Lucene was used to store all of the information. In this case, however, I had the opportunity to learn from our experiences and build on a more appropriate foundation:

  • A central, extensible metadata model that allows storage of any different repository, artifact, or resource type;
  • Delegating repository requests, to better facilitate repository grouping and proxying when configured, and to allow metadata to be regenerated from the storage on the fly;
  • Decomposing functionality into plugins so that optional portions can be removed. Plugins operate on metadata, certain repository events and a few other extension points. This remains a work in progress, but the aim is to allow reducing the deployable application to as little as a simple Maven proxy cache for your local machine with a very low footprint, and to make it easy and robust to write and use a combination of different plugins.

At the moment, these changes are all under the hood – apart from configuration there is no visible difference other than the number of bugs that got removed along the way! However, the decoupling will make way for easier development of new features and the opportunity for much needed advances in the UI.

Maven training in Oakland, November 2

As I’ve blogged previously, I’m gearing up again to present my training session Apache Maven: End-to-end at ApacheCon US in Oakland in just a few weeks now. There are still spots available, so go ahead and register from the ApacheCon US site. Noirin offers some tips on how to justify ApacheCon to your boss.

The training session is hands on – all of the software and material is distributed on CDs and we spend some time digging into Maven and development infrastructure in a variety of ways.

Apart from the standard training material, there is the opportunity to work through some specific questions for your environment if it is something that interests the whole group, and of course those discussions can be continued over the rest of the time at the conference.

Hope to see you there!

Interview with Feathercast about Maven Training

Last week, Rich Bowen from Feathercast interviewed me about the training I’ll be hosting in Oakland on November 2: FeatherCast » Blog Archive » Episode 63: Brett Porter – Maven. More information about the training session can be found on the conference web site or in my previous post.

Apache Maven: End-to-end training in November

It’s that time of year again! ApacheCon US in Oakland is on 2-6 November 2009. There are still discounts for registration by September 25.

I’m presenting my full-day Maven and development infrastructure training again on Monday November 2, called Apache Maven: End-to-end:

This training session will walk through the lifecycle of developing a typical Java application from creation to deployment, and show how to use Apache Maven most effectively to manage the build and development process. In addition to the fundamental building blocks of the project, the session will cover testing, day-to-day development in the IDE, application of Maven best practices, effective dependency management, establishing a release process, using profiles effectively, setting up documentation, tracking development reports and practices. Effective use of continuous integration (illustrated with Apache Continuum) and repository management (using Apache Archiva) as a part of development infrastructure for team and enterprise environments will be demonstrated. This course will be suitable both for those that are looking to get the most out of their existing Maven projects, and those that are looking to use Maven for the first time. Time is reserved for addressing specific situations that attendees have encountered in existing projects.

The material aims to offer the most to the intermediate Maven user, while still being appropriate for Maven beginners, and is refreshed with the latest work from the book.