The following document is intended as the general trip report for me at the 19th Systems Administration Conference (LISA 2005) in San Diego, CA from December 4-9, 2005. It is going to a variety of audiences, so feel free to skip the parts that don't concern you.
Today was my travel day. Headed off to the airport where we didn't take off until over an hour past our scheduled time thanks to a long wait on de-icing (it was 15F and snowing heavily). 4+ hours in the air (no video, no audio, full plane, no food), and off to the shuttle to the hotel.
Got to the hotel, checked in with no troubles (which pleasantly surprised me, given the troubles I'd had with this property in 2001 and 2003), and managed to get about a third of the way to the room before running into Bob and Ted. Chatted for a bit, then continued on to the room, and got about halfway there before running into Lee. Chatted for a bit and got to the room (right by the elevators), unpacked, and headed off to Registration to get my materials. Couldn't pay for my workshops because the credit-card machine wasn't working. Headed off to the nearby Fashion Valley Mall for an early dinner (having skipped lunch thanks to there being no actual food on the plane) at the Crocodile Grill (yummy porkchop) with Marybeth, Bob, Ted, and Aaron.
After dinner, paid for my workshops and then hung out with folks at Charlie's (the on-property bar) until 11pm or so, before heading off to bed to crash.
Today was my first workshop, Managing System Administrators, ably co-hosted by Cat and Tom. After the usual housekeeping-style announcements, we all mentioned the issues we had and wanted to discuss, then grouped them into similar areas. Our discussions included motivating yourself when you've lost interest in the job; dealing with different cultures and perceived-difficult personalities; integrating new people into the team; and recruiting, hiring, motivation, and retention. After the lunch break, we discussed management resources, hiring and firing, being both a techie and a manager at the same time, politics and relationship management, and finger-pointing. Due to the highly sensitive nature of the discussions, no additional information is available outside the workshop attendees.
For dinner, a group of 24 or so of us headed off to Todai for sushi. All you can eat, reasonable quality, and we were well and truly stuffed. At our end of the table we had dkap, Marybeth, and Dan, as well as Moose a couple of seats over and Brent at the far end. Dinner was followed by a large hot tub group, then schmoozing at Charlie's until closing (midnight), then to a tete-a-tete with a friend who's had a difficult last few months, and finally to bed around 2am.
Today was my "day off." I slept in, spent most of the day hanging out in the Atlas Lobby foyer chatting with folks, with a lunch out at Emerald (dim sum!) with Amy, Moose, Mike, Doug, Mike, and Tom P. We almost got LOPSA Executive Director Sam Albrecht to join us, but he got stuck dealing with a phone call or three.
Several of us (Robert, Robert, and Mark) went to the maul to Cheesecake Factory for dinner. After dinner (6 huge shrimp scampi with tons of angel-hair pasta, whole roasted garlic cloves, and the lemon butter sauce), I hung out in the hot tub until it closed, then managed to sneak into a private tasting of dark chocolate (Callebaut 70%) and some sherries (a 1971 Pedro Ximénez and an eastern solera blend); both were very yummy.
Had breakfast with Lois, Moose, and Bob.
Tuesday's sessions began with the Advanced Topics Workshop, once again ably hosted by Adam Moskowitz. We started with an overview of the revised moderation software and general housekeeping announcements. We followed that with introductions around the room — in representation, businesses (including consultants) outnumbered universities by about 4 to 1; over the course of the day, the room included 5 LISA program chairs (3 past, 1 present, and 1 known-future, up from 3 last year) and 7 past or present members of the USENIX, SAGE, or LOPSA Boards (up from 6 last year).
[... the rest of the ATW writeup redacted; check my LJ and my web site for details if you care ...]
After the workshop, a small mob of 8 of us — JD and Trey, Mike, Dan, Aaron, Bob A., Ellen, and I — went to Kelly's steakhouse on the hotel grounds for dinner, where I had a yummy bone-in prime rib (mmm, mmmeeeeat). This let us out just in time to head over to the alphabet soup BOF which introduced more new faces, including the very nice Wendy, who wandered in, got confused, stayed around, and later went out to the Cambridge Computing hospitality suite across the courtyard to bring back a selection of ice cream for everyone interested.
After the BOF I went to the LOPSA Hospitality Suite (formerly the lanai party suite), with a brief visit to the hot tub (mmm, hot tub). Some catching up with friends I see once a year at best, including someone trying to get me over to the other side of the pond for SANE 2006, and then to bed.
Had breakfast with Moose and Bob.
The technical sessions started with the usual announcements and acknowledgements in a slightly unusual format. (Note to future program chairs: doing applause for each person individually wastes the audience's time and makes them more annoyed than it is worth, even given how important it is to recognize the volunteers.) We received 54 abstracts, of which 24 were accepted. We had 1179 attendees by the opening session and 1300 by the time walk-ins were included. (No data was available on how many were tutorial-only versus technical session attendees.) We had a conference wiki again this year.
Program Chair David Blank-Edelman presented the best paper awards:
- Best Paper: Alva Couch, Ning Wu, and Hengky Susanto (Tufts University), "Toward a Cost Model for System Administration."
- Best Student Papers: "Toward an Automated Vulnerability Comparison of Open Source IMAP Servers" by Chaos Golubitsky (CMU), and "Reducing Downtime Due to System Maintenance and Upgrades" by Shaya Potter and Jason Nieh (Columbia University).
Doug Hughes presented the other awards. Tom Limoncelli and Christine Hogan Lear won the SAGE and LOPSA outstanding achievement award for their de facto state of the art book, The Practice of System and Network Administration; and Brandon Allbery of CMU won the second annual Chuck Yerkes award.
Qi Lu, VP of Engineering at Yahoo!, gave the keynote address. Unfortunately, the antics of the program chair, who apparently forgot that the chair's job is not to be entertaining but to coordinate, facilitate, moderate, and otherwise to get out of the way, chased me out of the room early with too many too-long music clips (and no apparent credit given to ASCAP/BMI, unlike the other usual music provider at LISA),¹ and the speaker's accent and poor slides chased others out soon thereafter. (Judging from the backchannel, it didn't get interesting until another 20 minutes had gone by.)
After the morning break, I wanted to go to the network black ops talk by Dan Kaminsky. Unfortunately, the speaker did not show up and the talk was cancelled 15 minutes into the time slot.²
After that, I did hallway track with folks until the vendor floor opened at noon. I did a lap of the floor, bought a couple of LOPSA t-shirts, realized I'd left my bag (and laptop) out with Moose in the hallway, got it, went back, and chatted with my friend Roger, who was working his company's booth.
Went to lunch with (among others) Greg, Sabrina, Bob, Ted, Joe M., Joe S., Stephen, and Trey. We wandered over to the food court at the maul; I had some decent-enough fast-food gyros.
After lunch, I attended Tom Limoncelli's talk on what small sites can learn from big sites. He provided a 4-phase plan:
- Acclimation: Who are the players, current emergencies, and project backlog.
- Basic stability: Backups, replace accident-of-history technology decisions that hurt reliability, and learn the purchasing process; stop doing things badly; documentation as flat files, labels on equipment, etc. Cooling, Power, Security, Wiring (CPSW) as a mantra.
- Basic IT applications: Helpdesk (RT), Monitoring (Nagios, Cricket), Documentation (TWiki), Remote Control (VNC, KVM, Serial Console Server, IP-KVM), and Backups (with an automated tape library).
- Clean Up: Corporate projects, finish the last 2% of previously-started projects, writing and enforcing policies, big visiony things (global directory, written policies, job descriptions and raises, etc.)
- Growth: Or outsource it all to India and skip town.
You must manage expectations, from leadership team through users. Be honest yet optimistic. Quantify when you can. Ended with "We're going to get through this, but it isn't going to be pretty, or fun, or inexpensive." Also, write a quick-and-dirty vision statement for the leadership. (Slides will be online on the web site later.)
In the next block I attended Strata Chalup's talk, "Under 200: Applying IS Best Practices to Small Companies." She posits that we want to be in a results-driven proactive mode, not a crisis-driven reactive mode. (IS versus IT: We're providing Information Services — supplying a service — not shiny gadgets (technology). Our goal is to make others be more productive; using IS not IT reminds us of this.)
We provide a mix of services, supporting business priorities (the business side sets the priorities, we support that), and our customers are peers and colleagues. We need to adjust our attitudes; we want to be future-paced, not present-reactive. We need to make things better, fix the problems, not just apply palliatives to symptoms. Also, using role-accounts and staff lists ("webmaster" not "joe," "helpdesk" not "bob, fred, and sally") improves things, especially as you grow and change staff.
Applying things like access control, modularizing and standardizing the environment, proactively managing licenses, building a knowledge base or documentation repository, instituting a change control policy, and generally going to a policy-based view instead of an ad hoc view does NOT magically fix things overnight. Reversing entropy is slow and difficult, and it won't be perfect on day one.
Direct customer focus changes include maintaining a stockroom (with replacement cables, mice, keyboards, media, mousepads, and so on), keeping regular office hours, leveraging the ticket system, and tracking requests. Metrics can be good, as long as you collect the right data: standards (heartbeat, uptime, disk usage, performance), tickets (number, time length, etc.), and company-specific needs (e.g., bandwidth, services, licenses, checkin/checkout usages, etc.).
Leverage your ticket system. Everything should go into it, including email and answers. Drop-ins and phone calls should be added. Attach documents. Project-based work may be an exception, depending on your real needs. Graphing the metrics helps you see things you might miss in raw numeric data. You should have meaningful priorities (not just critical, high, medium, and low) where meaningful is aligned with business priorities (e.g., sales and customer service are probably more important than engineering, other than release dates; finance is probably more important as well); using due dates will help as well. Service-level expectations need to be set.
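A minimal sketch of what "priorities aligned with business priorities, plus due dates" could look like when ordering a ticket queue. This is not from the talk: the department ranking, ticket IDs, and dates are hypothetical, and a real ticket system such as RT would have its own fields for this.

```python
from datetime import date

# Hypothetical business ranking (lower number = handled first),
# loosely following the talk's example ordering.
BUSINESS_RANK = {"sales": 0, "customer service": 0, "finance": 1, "engineering": 2}

# Hypothetical open tickets.
tickets = [
    {"id": 101, "dept": "engineering", "due": date(2005, 12, 12)},
    {"id": 102, "dept": "sales", "due": date(2005, 12, 15)},
    {"id": 103, "dept": "finance", "due": date(2005, 12, 9)},
    {"id": 104, "dept": "sales", "due": date(2005, 12, 10)},
]


def work_order(ticket):
    """Sort key: business rank first, then earliest due date."""
    # Departments missing from the ranking sort after all ranked ones.
    rank = BUSINESS_RANK.get(ticket["dept"], len(BUSINESS_RANK))
    return (rank, ticket["due"])


queue = sorted(tickets, key=work_order)
print([t["id"] for t in queue])  # [104, 102, 103, 101]
```

The point of the tuple key is that a due date only breaks ties within the same business rank; whether that is the right trade-off depends on the site.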
Monitor to establish baselines and make the results available. This helps with the finger-pointing issues and usually enables automated complainers to feed into the ticketing system. Use the reports generated by your monitoring and ticketing systems. Document productivity in customers and on the team. This helps you establish proof for needing more staff, or for identifying problem customers.
After the sessions ended I attended the LOPSA Community Meeting (and transcribed the Q&A session for the web site).³ LOPSA President Tom Perrine spoke about what we've done (gone live as an organization 25 days ago, produced an active member-focused website, formed interest-specific mailing lists, established board/membership communication, including a monthly memo to members (since August), and scheduled regular IRC chats with the next one set for December 15th), thanked our volunteers, commented on the business plan (which conservatively has LOPSA in the black by 2007, which is even more amazing since our seed budget money from USENIX is not and will not be forthcoming), and thanked our sponsors and volunteers (especially the leadership committee and technical team). They announced the Board vacancy has been filled by Matthew Barr,⁴ and thanked Art Director JD Welch with a Sony PSP.
Tom Perrine next spoke about the future, both from a membership perspective and a professional one, and urged people to get involved, either as members or volunteers or both. He opened the floor for Q&A, and LOPSA Spokesmodel Chris Palmer answered the questions with aplomb. Several people (notably Greg Rose and Brent Chapman) stressed that LOPSA needs financial support to succeed. This is not the time to sit back and wait, or LOPSA programs won't happen. Join, and volunteer if you can, and consider donating if you have any extra money beyond the membership fees. Finally, the meeting closed by urging people to use the large post-it sheets on the walls to brainstorm thoughts and ideas about the organization: what benefits should we provide, what programs and services, what discounts might members want, and so on. More details are on the LOPSA web site.
After the community meeting, I headed off to dinner at Forever Fondue with Lee, Moose, Mike, Alex, and Gabe. We had yummy fondue (4-course of caesar salad, cheese (swiss and cheddar), meats in beef broth and seafood in vegetable broth, and dark chocolate). Got back to the hotel and swung through the LOPSA Hospitality suite before heading off to bed.
Woke up with delightful sinus problems caused by the very low relative humidity and the head cold. The shower helped, but between that and having to shout at the noisier parties, I expect I'll lose my voice by the weekend. Had breakfast with Moose and Bob again (yay, morning breakfast crew!).
The first session this morning was Brent Chapman's "Incident Command for IT: What We Can Learn from the Fire Department." As IT managers we're often called to manage incidents (such as security events, service outages, and infrastructure failures), both planned and unplanned. Places that do emergency management (like fire and police departments and search and rescue teams) organize themselves on the fly, especially with multiple organizations involved; we in IT can learn from how they do this. For example, if a car hits a fire hydrant, who's involved? The fire department to rescue trapped people, one or more ambulance services to treat and transport any injured (in the vehicle or nearby), police to direct traffic and investigate, the water department to shut off the hydrant, and the electric company to shut down the flooded transformer. How this is organized, who's in charge, who figures out who does what in what sequence, and how you do it all without duplicating or wasting effort is the concept of (preplanned) incident response. An IT example is total power failure to a data center where the utility died, the UPS couldn't handle the load, and the generators didn't kick in. (We went through an old GNAC real-life situation as this example.)
The Incident Command System (ICS) is a standardized organizational structure and set of operating principles, as well as tools for command, control, and coordination of an incident response. It lets you coordinate the efforts of multiple parties toward common goals, and uses principles to improve efficiency and response. It started in California and is now a national standard. ICS has nine principles:
- Modular and scalable organization structure: Command (Incident Commander (IC), responsible for overall management, and performs all 4 section-chief roles until delegated); Operations (where the work happens, develops and executes plans to achieve Command's goals, usually the largest number of people (80-90%) involved, and worries about NOW); Logistics (obtains all resources, services, and support, including facilities, transport, supply, equipment maintenance and fueling, food and medical care of the Incident team; more important on big, long-running incidents and may not be needed on small ones); Planning and Status (collects and evaluates info needed to prepare action plan, thinks about the FUTURE, keeps track of what has been and still needs to be done, and keeps current status up to date to let new arrivals brief themselves); and Administration and Finance (tracks incident-related costs, including T&M as needed for reimbursement, and administers procurements that Logistics arranged; usually only activated on the very largest and longest-running incidents). All but Command are as-needed; smaller incidents may let one person be in multiple boxes.
Note that rank in the administrative organization (peon, manager, VP) doesn't necessarily apply to rank in the incident organization (ops peon, Ops Section Chief, IC).
- Maintain a manageable span of control. Each supervisor (node in the tree) should have 5 +/- 2 subordinates; fork tree as needed. Divide as makes sense, be it functional or geographic. Maintains information flow.
- Unity of command. In an incident, every person has exactly one boss, up to the IC, in a strict tree structure. No matrix or dotted-line management in an emergency, since you may not know each other and therefore the strict yet simple structure helps.
- Explicit transfers of responsibility. The organizational tree may change, but the transfers are explicit. A more-senior arrival doesn't immediately take over a position in the tree until an explicit briefing and relief. The Planning/Status team keeps the overall org chart updated.
- Clear communications. Communicate clearly and completely, not in code, to reduce confusion, reduce time spent clarifying, and let other people monitor things. Also, talk directly to the resources when possible; use the tree to locate resources but talk directly as needed.
- Consolidated action plans. Written is better. Sec Chiefs help develop it, and it's for a given operational period (hour, shift, day, week, whatever). If a plan crosses organizational or specialty boundaries, write it down.
- Management by objective. Tell people WHAT you want accomplished, not HOW. This lets the people closer to the problem do the right solution. Let them be flexible and creative, let them use their skills to solve the problem.
- Comprehensive resource management. All assets and personnel need to be tracked, so new ones can be used effectively and existing ones can be relieved. Folks should sign in through Administration and then wait for assignment to let their knowledge and skills be put to best use. You might have a "report to" site to let them show up and wait in one place, and get quicker new-arrival briefings.
- Designated incident facilities. Designate a single command post (CP) where everybody expects to find the IC. It's also often useful to designate a staging area (cf. #8) for new resources, sign in, assignment, etc.; usually next to the CP.
Tips to get the most out of ICS:
- Establish ICS early in an incident.
- It's a toolbox full of tools — use what you need but keep it simple, don't overplan it ("plans don't survive contact with reality") — who's on vacation, who quit, who's newly-hired, and so on.
- Practice it at every opportunity, including routine and preplanned events (such as moves, upgrades, and deployments).
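Two of the ICS principles above, unity of command and a manageable span of control, map naturally onto a tree data structure. The following is a minimal sketch, not anything shown in the talk: the `Responder` class, the `MAX_SPAN` constant, and the people's names are all hypothetical, and the limit of 7 is simply the upper end of the "5 +/- 2" guideline.

```python
MAX_SPAN = 7  # upper end of the "5 +/- 2" span-of-control guideline


class Responder:
    """One node in the incident organization tree (hypothetical model)."""

    def __init__(self, name, role):
        self.name = name
        self.role = role
        self.boss = None     # unity of command: at most one boss
        self.reports = []    # direct subordinates

    def assign(self, person):
        """Put person under this node, enforcing one boss and the span limit."""
        if person.boss is not None:
            raise ValueError(f"{person.name} already reports to {person.boss.name}")
        if len(self.reports) >= MAX_SPAN:
            raise ValueError(f"{self.name} is at the span limit; fork the tree")
        person.boss = self
        self.reports.append(person)

    def chain_of_command(self):
        """Names from this person up to the Incident Commander."""
        chain, node = [], self
        while node is not None:
            chain.append(node.name)
            node = node.boss
        return chain


# Hypothetical incident: an IC, an Operations Section Chief, and one responder.
ic = Responder("Pat", "Incident Commander")
ops = Responder("Sam", "Operations Section Chief")
tech = Responder("Lee", "Operations")
ic.assign(ops)
ops.assign(tech)
print(tech.chain_of_command())  # ['Lee', 'Sam', 'Pat']
```

Trying to assign Lee to a second boss raises an error, which is the "no matrix or dotted-line management" rule in code form. An explicit hand-over (principle 4) would need a separate operation that records the briefing before changing the tree.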
After the morning break, where I ran out to the vendor floor and grabbed a LOPSA laminated luggage tag, I went to Peyton Engel's talk about "What's the Worst That Could Happen." There are three ingredients that characterize a vulnerability: a problem or flaw of some kind, the problem involves some kind of change of security state, and it must be possible to trigger the problem. Note that knowledge of the problem or the existence of a fix are not part of the definition. Once you know what the vulnerabilities are you can quantify the risk if they're exploited (total cost, including hardware, software, data loss, loss of face, and so on; and frequency, or how often the exploit will happen).
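One common way to combine those two numbers is to multiply cost per incident by expected incidents per year, giving an expected annual loss. The notes above don't say exactly how the talk combined them, so treat this as an assumption; the vulnerability names and figures below are made up purely for illustration.

```python
def annual_risk(cost_per_incident, incidents_per_year):
    """Expected yearly loss: total cost of one exploit times how often it happens."""
    return cost_per_incident * incidents_per_year


# Hypothetical vulnerabilities: (total cost per exploit in dollars, exploits per year).
vulns = {
    "unpatched web server": (50_000.0, 0.5),
    "weak wifi passphrase": (5_000.0, 2.0),
}

# Rank by expected annual loss, worst first.
ranked = sorted(vulns, key=lambda name: annual_risk(*vulns[name]), reverse=True)
for name in ranked:
    print(f"{name}: ${annual_risk(*vulns[name]):,.0f} per year")
```

With these numbers the rarer but costlier problem ranks first (25,000 versus 10,000 per year), which is the kind of comparison the cost-and-frequency framing makes possible.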
Went to lunch with Frank at the maul food court, and met up there with Michael, Cat, Carson, and Mark R. I was feeling more and more run down, so I did the hallway track on the couch until the plenary session, where Matt Blaze spoke about the security and reliability of wiretapping.
After the plenary session, I napped briefly before the reception. I wasn't too thrilled with the food (the bruschetta was good, but the garlic breadsticks were rubbery, and the only non-salad side dish for the roast beast was french fries), though it was better than the 2001(?) "cold burgers in the parking garage with a circus theme" attempt they had. I'm also over the whole "gambling with fake money to win raffle tickets for prizes most people don't want" thing. I'm also confused as to why, despite all previous history, they raffled off the grand prize before the smaller prizes, but I'm not directing this production so it's not my problem.
After the reception I changed into my tuxedo and headed up to the combination party; Cat and Tom cohosted their annual birthday party (formal attire requested) with the program chair's usually-on-Friday party in the presidential suite upstairs. I was, of course, the best dressed person there until Greg (in his tux) and Pat (in a lovely dress) arrived. I had to step out for a while to help with the Game Show run-through (we wound up changing the order of answers within a couple of categories and replacing an entire category (the test audience only got one of the five, and only because the answer was written in the picture on the page)). Headed back to the party, swung through the room, and then headed back down to the quieter LOPSA party suite for a while (where I got to process two memberships and a sponsorship). When the upstairs party ran out of beer and folks started coming downstairs, I headed off to bed.
This morning, despite the head cold, I woke up at 7. Did the morning ablutions, packed the majority of the Stuff, and headed off to breakfast. Sat with Bob, Tom, and Mike. None of the 9am sessions grabbed me, so I took a hallway track block to catch up on email and my webcomics before the conference wireless network went down at 2pm. During this time and the break I had 2 cups of hot mint tea.⁵
After the morning break, I went to Michael Crusoe's talk. He is (in his own words) a recent escapee from the biometrics industry, and he spoke about using biometrics from a sysadmin point of view. It was a high-level overview of the major biometric modalities, or methods of using body parts for identification. Techniques included:
- Facial recognition, which is error-prone in 2 dimensions due to changes in pose and lighting
- Fingerprinting, which can use the actual image or the minutiae (the changes and breaks in ridges); real-world testing shows that errors, both false positives and false negatives, decrease as the number of fingers examined increases
- Hand geometry readers, which are the largest-deployed technology today
- Iris recognition, which is the most accurate due to the large amount of data available in a small space (striations, positioning, and so on), but which is very expensive to calculate and only one vendor is in this space (with soon-to-expire patents, so this may change)
- Speaker recognition, or voice-response
Other modalities were mentioned, including vein recognition (using the pattern of the veins in the hand) and dynamic signature recognition (specifying the location, pressure, and velocity of the pen). In all cases, there are efforts to ensure the body part is live (either by prompted motion, such as smiling or blinking on cue, or by scanning for temperature or motion).
Afterwards a group of us headed off to lunch at the food court at the maul. I caught up on email before heading off to the talk about silly network tricks. It was basically indicating various silly (and occasionally stupid) UI design flaws in a variety of vendors, so I didn't keep detailed notes. I also had to leave a bit early to help set up the game show.
We had the luxury this year of plenty of time to set up, so we set up from 2:45pm and opened the doors around 3:35pm for a 3:45pm start. Three rounds of four people and the finals round of the three individual-round winners, and we were done by 5:20pm. (We had to be out of the room by 5:30pm for a different group's starting-at-6:00pm party, so that extra ten minutes to do tear-down helped.)
After the tear-down, did some chatting and schmoozing with folks, and wound up getting the LOPSA Suite key from Tom Perrine to go process another new membership. Around 7:45pm (scheduled; in reality, it was more like 8:30pm) headed off to Rei Do Gado downtown for meat on swords. (Mmm, mmmmeeeeeat.) About $1800 for 32 people. Yum. Bacon-wrapped filet, tri-tip steak, sirloin steak, skirt steak, chicken stuffed with cheese, leg of lamb, pork sausage, and BBQ back ribs, plus seafood and prosciutto from the salad bar. Add some chocolate ganache and 'twas very tasty indeed. (It would've been nicer if they'd done at least each of the three tables on individual checks, rather than 3 tables totaling 32 people on one bill then not generating receipts for those of us who needed them. This added at least 30 if not 60 minutes to the end of the evening. I was not happy.)
After finally getting back to the hotel around 11:30pm, I hung out at the LOPSA Hospitality Suite and Party Room for an hour or so, came back upstairs, packed up, set the alarm, called for a morning shuttle to the airport, and crashed.
Today was the travel day to return home. Woke up before the alarm, finished packing, checked out (and had a lower-than-expected room rate), and caught the 7am shuttle to the airport. Got there in plenty of time, got checked in, got to the gate, boarded the plane, and were all set for an on-time departure — and then the captain came on to tell us that on his walk-around he saw a dent in a fan blade in the number-2 engine. Since it's only a 2-engine aircraft and the tolerances are tight enough and there's no spare handy, they asked us to deplane. We did, and the poor gate-agents got the joys of rebooking all 150 or so passengers. I got rebooked onto the 1:58pm flight (yay, 4 more hours at SAN) and they claimed they'd shift the luggage without my having to go out to baggage claim, grab it, and recheck it. (They were right.) Got a $5 meal voucher too (which wasn't enough to cover lunch).
Ran into several people from the conference on the flight, including a nice lady heading back from her first LISA to Michigan State University's Radiology department. She's not a LOPSA member yet, but hopefully she'll check out the web site and join. Ran into several people from the conference wandering the concourse around lunchtime and chatted for a bit. Wrote up bits of the trip report and napped a bit during my enforced break.
Finally got aboard and flew home. Landed on-time (only 5:13 late) around 7:30pm, had my luggage in hand by 8pm, and was home by around 8:30pm or so. Wrote up the last of the trip report, unpacked, had some dinner, and updated the LJ cross-referencing.
|¹||Someone suggested that the chair might be jonesing for the stage since he doesn't have an IT this year, but nevertheless the job of the chair at the conference is not to do a song and dance number but to give the announcements, introduce the paper award winners and other award presenters, and introduce the keynote speaker.|
|²||We later found out that the speaker had scheduled himself for the wrong week. Whoops.|
|³||I'm disappointed they started over ten minutes late, and that two of the Board members could not be there on time: Andrew Hume, who thought it started at 7pm, and Stephen Potter, who was closing down the booth on the vendor floor.|
|⁴||Matthew came in tenth in the voting for the original 9 Board slots, and was the unanimous choice of both the leadership committee and the Board to fill Geoff Halprin's vacated seat.|
|⁵||I almost never drink hot tea. Other than with Chinese or Japanese food, I simply don't drink it. Having 2 cups of it in one morning should be a clue.|