Conference Report: 2014 LISA Advanced Topics Workshop

Tuesday's sessions included the 20th annual and final Advanced Topics Workshop; once again, Adam Moskowitz was our host, moderator, and referee. Unlike past years, we ran for only a half day. With only two new participants (both longtime LISA attendees), Adam covered the participants' interface to the moderation software in brief one-on-one sessions over lunch. We started with our usual administrative announcements but mostly skipped introductions. However, Adam noted that two people here were at the first ATW, and that he and I have both been here for the past 18 years (he as moderator and I as scribe). In terms of representation, businesses (including consultants) outnumbered universities by about 2 to 1 (about the same as last year); over the course of the day, the room included 10 LISA program chairs (past, present, and announced future, up from 5 last year) and 9 past or present members of the LOPSA or USENIX boards.

Our first topic, which took two-thirds of the discussion time, was why this was the last ATW.

Of course, since this decision was announced without input from the participants, it... generated a very spirited and passionate discussion (and at times an outright debate). That discussion wandered through what the workshop should be if it were to continue, as well as the future direction of the LISA conference itself. No definitive conclusions were reached, in large part because not all stakeholders were present or represented.

It was argued that the workshop has been successful. The founder, John Schimmel, looked at the conference and identified a problem: More-senior system administrators would only come to LISA (which was then more about training junior administrators) if they were speaking or teaching, and were much less likely to come as an attendee. The workshop was an attempted solution to that problem: Get the more-senior sysadmins present for the workshop, where they could have non-public discussions, without having to step down the language for more-junior sysadmins to understand, and they'd be (and were) much more likely to stick around for the rest of the conference.

It was also argued that there's still value in getting together, even if just "at the bar." Many were quick to point out that it would be much more difficult to sell "I'm meeting with a couple of dozen senior sysadmins at the bar" than "...at the workshop" to their management.

Some of the other points we considered during the discussion included:

It was stressed that all interesting proposals (for papers, talks, tutorials, and workshops) are both welcome and desired. If we were to say "After N years we have a new version of the ATW, called something else," and explain how it would be different, it would be considered. There is a limit on the number of workshops, based both on the number of available rooms and on the number of places any one of us can be at one time. It's not just what would serve USENIX or LISA better but what would serve us (the constituents) better.

As a palate cleanser we went with a lightning round: What's your favorite tool? Answers included Aptly, C++11, csvkit, Chef, Docker, Expensify, Go, Google Docs, Grafana, Graphite, HipChat, JCubed, JIRA, R, Review Board, Sensu, Sinatra, Slack, git and git-annex, logstash, and smartphone-based cameras.

Our next discussion was about platform administrators. With user-level networking and systems becoming one blended platform, are platform admins the new sysadmins? Is this a new tier for provisioning logical load balancers and front and back ends? The sense of the discussion was that it's still sysadmin work, just with a specific focus. It's like any other new technology, and may be due to the extension of virtualization into the network world. The "are we specializing?" question comes up often (for example, storage, network, Windows versus Unix, and so on), and we're still sysadmins.

One participant strongly disagreed, arguing that it's fundamentally different: for the first time, it's straightforward and easy to think of system deployment as a cheap software call or RPC. It's so lightweight in so many ways that it's fundamentally different from early virtualized environments. His business expects to routinely spin up thousands of virtual instances, and how much and how fast you can spin things up (and down again) is a game changer. The other part of it is that the environments they're using for this are fundamentally confused about everything of value, with APIs calling APIs. At some level this is sysadmin work on a new layer, because it's a programmability block model and much of the sysadmin stuff is hidden. What happens when you're repairing a cluster and something says you have to scale out from 200 to 1000? Either "You don't" or "You wait" might be the answer.
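
As an illustration for readers (not anything a participant showed), here's a minimal sketch of what "deployment as a cheap software call" can look like, assuming an AWS environment and Python's boto3 SDK; the region, AMI ID, and instance type are placeholders:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")   # placeholder region

    def spin_up(count):
        """Launch `count` instances and return their instance IDs."""
        resp = ec2.run_instances(
            ImageId="ami-0123456789abcdef0",   # placeholder AMI
            InstanceType="t2.micro",           # placeholder instance type
            MinCount=count,
            MaxCount=count,
        )
        return [inst["InstanceId"] for inst in resp["Instances"]]

    def spin_down(instance_ids):
        """Terminate the instances once the surge is over."""
        ec2.terminate_instances(InstanceIds=instance_ids)

    # Scaling from 200 to 1000 becomes a function call, not a purchase order:
    # ids = spin_up(800)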

Another noted that we're systems administrators, not just focused on the single computer (or network, or person), but on the interaction between those systems (computers, networks, people, and so on). Nothing's really changed: We still look at the pieces, the goals, and if it's delivering the product/service as expected.

Two side discussions came out of this as well. First, with virtualization and cloud and *aaS, how many businesses still administer their IT as their core function? Second, sysadmins who won't write code (including shell scripts) will soon be out of a job, since the field is moving towards that: Systems will be built by writing code. With virtualization and APIs, we suspect that most sysadmins will fall into the "services" mode, maintaining services on perhaps-dedicated probably-virtual machines, as opposed to the folks administering the underlying hardware on which the virtualized machines run.

Our next discussion was started with the phrase, "If I had a dollar for every time someone said DevOps was the future...." It took forever for Agile to get into Gartner, but DevOps is there already and, in the speaker's opinion, has jumped the shark in less than two years. DevOps is a horribly abused term, despite being a paradigm shift. At ChefConf, the belief was that DevOps was "software engineers throwing off the yoke of the evil sysadmins [who] have oppressed them for so long." (That's a direct quote from their keynote speaker.) Code needs to be in the realm of infrastructure; what we did 20 years ago won't scale today. There's a huge difference between writing actual code and writing a Ruby file that consists entirely of declarations.
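
To make the declarations-versus-code distinction concrete, here's a small illustrative sketch in Python rather than Ruby; the package names and the Debian-style dpkg/apt-get commands are assumptions for illustration only. The first half is pure declaration (just data); the second half is actual code that has to inspect the system and converge its state:

    # Declarations only: this is data, and nothing here executes any logic.
    desired_packages = [
        {"name": "nginx", "state": "installed"},
        {"name": "ntp", "state": "installed"},
    ]

    # Actual code: imperative logic that checks current state and acts on it.
    import subprocess

    def installed(pkg):
        """True if dpkg reports the package as installed (Debian-style assumption)."""
        return subprocess.run(["dpkg", "-s", pkg], capture_output=True).returncode == 0

    def converge(packages):
        """Install any declared package that is missing."""
        for pkg in packages:
            if pkg["state"] == "installed" and not installed(pkg["name"]):
                subprocess.run(["apt-get", "install", "-y", pkg["name"]], check=True)

    converge(desired_packages)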

Another participant's company has some developers who do sysadmin work as well, but not all of the developers there have the background, and he doesn't trust them all to do it: their sysadmins are developers, but not all of their developers are sysadmins.

One participant has been going to DevOps and infrastructure-as-code meetups for a while now, and says it's like SAGE-AU and Sun Users' Group repeating the same mistakes all over again.

Even now, everyone still has a different definition of what DevOps means, though most could agree it's not a tool, position, mechanism, or process but a culture: having the operations folks and the engineers talk to each other while the product is being written, as well as after operations has it in production. There's a feedback loop through the entire life cycle. But having "a DevOps team" misses the point; it's about not isolating teams.

We had a brief conversation on recruiting. How do you find and entice qualified people to jump ship to a new company? The participant who raised the question has problems finding candidates who want to come to his company at all. The only response was that sometimes you simply can't; one participant noted he had turned down a great job because its location was unpleasant enough to be a show-stopper.

We then discussed what tools people are using to implement things within a cloud infrastructure. One participant is entirely in AWS, for example. Do you do it manually or through automation, and what do you use to track and manage things? One participant snarked that he'd have an answer next year.
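
As one hypothetical example of "tracking things" through automation in an all-AWS shop, the sketch below uses boto3 to inventory running instances by a "Service" tag; the tag name and region are assumptions, not anything a participant reported using:

    from collections import defaultdict
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")   # placeholder region

    def inventory_by_service():
        """Group running instance IDs by their (assumed) Service tag."""
        groups = defaultdict(list)
        paginator = ec2.get_paginator("describe_instances")
        pages = paginator.paginate(
            Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
        )
        for page in pages:
            for reservation in page["Reservations"]:
                for inst in reservation["Instances"]:
                    tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
                    groups[tags.get("Service", "untagged")].append(inst["InstanceId"])
        return dict(groups)

    for service, ids in sorted(inventory_by_service().items()):
        print("%s: %d instances" % (service, len(ids)))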

Another is about to start moving away from the AWS API to Terraform (written in Go), which supports several different cloud vendors and has a modular plug-in system. Beyond that it depends on what you're trying to do.

Yet another says part of this is unanswerable because it depends on the specific environment. His environment is in the middle of trying to deploy OpenStack storage, and most of the tools can't work because they reflect the architectural confusion thereof. They have used ZeroMQ for monitoring and control because of its scalability (to a million servers, which is what they call a medium-sized application). Precious few libraries can handle that level. (That's the number thrown around by HPC, too.)
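
For readers unfamiliar with the pattern, here's a minimal PyZMQ PUB/SUB sketch of many servers fanning monitoring data in to a single collector; the host name, port, and message format are assumptions, and a real deployment at that scale would add intermediaries, batching, and authentication:

    import time
    import zmq

    def publisher(host="collector.example.com", port=5556):
        """Runs on each monitored server: publish a metric line every 10 seconds."""
        ctx = zmq.Context.instance()
        sock = ctx.socket(zmq.PUB)
        sock.connect("tcp://%s:%d" % (host, port))
        while True:
            # topic-style message: "cpu <hostname> <value> <timestamp>"
            sock.send_string("cpu web042 0.37 %d" % int(time.time()))
            time.sleep(10)

    def collector(port=5556):
        """Runs centrally: bind, subscribe to everything, and process metric lines."""
        ctx = zmq.Context.instance()
        sock = ctx.socket(zmq.SUB)
        sock.bind("tcp://*:%d" % port)
        sock.setsockopt_string(zmq.SUBSCRIBE, "")   # no topic filter
        while True:
            print(sock.recv_string())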

Once you care about speed and latency and measurements you can make a better judgement of how much to spin up to handle those requirements and whether physical or virtual is the right answer for your environment.

Our final discussion topic was on getting useful information from monitoring data. One participant loves Graphite. He has a new hammer, so everything looks like a thumb, and he's been trying to get more and more into it... and now that he's taken the stats classes he needs more low-level information so he can draw correlations... and eventually move data out of the system. What are others doing with their statistics? What are you using to gather, store, and analyze data? In general, R and Hadoop are good places to start, and there's an open source project called Imhotep for large-scale analytics. Several others noted they use Graphite as a front end to look at the data. Spark is useful for real-time and streaming work. Nanocubes can do real-time manipulation of the visualization of a billion-point data set. Messaging buses discussed included RabbitMQ and ZeroMQ.
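
For anyone who hasn't fed data into Graphite: on the gathering side, carbon accepts a plaintext "metric value timestamp" line over TCP (port 2003 by default). A minimal sketch, with a placeholder host and metric name:

    import socket
    import time

    CARBON_HOST = "graphite.example.com"   # placeholder hostname
    CARBON_PORT = 2003                     # carbon's default plaintext-protocol port

    def send_metric(path, value, timestamp=None):
        """Send one data point to carbon as a single plaintext line over TCP."""
        ts = int(timestamp if timestamp is not None else time.time())
        line = "%s %s %d\n" % (path, value, ts)
        with socket.create_connection((CARBON_HOST, CARBON_PORT), timeout=5) as sock:
            sock.sendall(line.encode("ascii"))

    # Example: record a request latency so it can later be pulled into R for analysis.
    send_metric("app.web01.request_latency_ms", 42.0)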

How does this help? In one environment, they used the collected metrics to move a data center from Seattle to San Jose, and the 95th percentile improved a lot. Another noted that Apple determined that the transceiver brand makes a huge difference in performance.

We wrapped up with the traditional lightning round asking what we'd be doing in the next year. Answers included an HPC system with 750K cores and an 80PB file system, automation and instrumentation, chainsaws and hunting rifles in Alaska, enabling one's staff, encouraging people to create and follow processes, exabyte storage, functional programming, Hadoop, home automation, Impala, infrastructure, learning a musical instrument, merging an HPC-focused staff into the common IT group, moving from GPFS to something bigger, network neutrality, organizing a street festival and writing the mobile app therefor, packaging and automated builds, producing a common environment across any type of endpoint device, R, scaling product and infrastructure (quadrupling staff), Spark, trying to get the company to focus on managing problems rather than incidents, and updating the Cloud Operational Maturity Assessment.

Our moderator thanked the participants, past and present, for being the longest-running beta test group for the moderation software. The participants thanked Adam for moderating ATW for the past 18 years.




Last update Feb01/20 by Josh Simon (<jss@clock.org>).