Tuesday's sessions began with the Advanced Topics Workshop; once again, Adam Moskowitz was our host, moderator, and referee. We started with our usual administrative announcements and the overview of the moderation software for the three new folks. Then, we went around the room and did introductions. In representation, businesses (including consultants) only outnumbered universities by about 2 to 1 (down from 4 to 1); over the course of the day, the room included 6 LISA program chairs (past, present, and future, up from 5 last year) and 9 past or present members of the USENIX, SAGE, or LOPSA Boards (the same as last year).
Our first topic was cloud computing. We discussed the various takes on it, and one of the clearest issues is one of definition: Technologists and non-technical end-users have different definitions of what it means. Comparisons to grid computing were made; the consensus was that grid is for high performance computing (HPC), cloud computing isn't, and that grid and cloud are solving different problems. When discussing cloud computing you need to determine if your definition includes the server/OS (be it physical or virtual), the applications, or the data itself. Then there's the issue of the data, and who owns it, maintains it, backs it up, is responsible for restores as needed, and deletes it when you're done with it. So far, in general, cloud computing is good in that you and your company can save money on hardware and possibly on support (licensing, maintenance, and staff) costs, but so far we've ignored the security aspect. Contracts are all well and good, but there are legal, regulatory, and security issues regarding certain types of data (student records, health records, personally identifying information, access control for research data, and so on) that make it a bad idea for some environments, industries (health and financial), and applications. How do you audit your cloud provider?
Next we did a lightning round of cool new-to-individuals tools or technologies. The most common response was a programming language (Erlang, Python, and Ruby); others were Bugzilla, iPhone, memcache, nfswatch, rrdtool, XMPP for system-to-system messaging, and ZFS. One person mentioned his new Viking 6-burner range.
After our morning break, we resumed with a discussion of file systems. Some are looking for alternatives to NFS that scale better; most seemed to like GPFS, and others mentioned OCFS2 and GFS2. In all cases, you need to look at your requirements to find the one that best suits your needs; for example, OCFS2 didn't scale beyond 7 or 8 nodes in a cluster of virtual machines, but if you only have 3 or 4 it might be sufficient. This segued into a distribution discussion regarding what needs to be local and what can be remote, as well as what needs to be read-write (more expensive) versus read-only. From there we segued into charge-back. Can you charge back to other departments or users the cost of your file services (and indeed other services), and if so, how? Most people are looking at tiered models, such as "dumb SATA is free, if you want RAID or backups it costs more." The problem is that end-users can add cheap disk to their systems and not see the difference between disk (the physical device and its data) and storage (the infrastructure for availability, retention, and recovery). Some folks are charging back what they can even though it's not enough to cover the hardware costs, let alone the staff costs. It was stressed that you have to proportionally reflect your costs or the users will game the system, and you have to be careful not to oversubscribe.
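As a rough illustration of the tiered charge-back idea, here is a minimal sketch in Python. The tier names and the per-gigabyte rates are invented for the example and were not discussed in the workshop; the only point is that each tier's price should track what it actually costs to provide.

```python
# Hypothetical tiers and rates (cents per GB per month); not from the workshop.
RATE_CENTS_PER_GB = {
    "sata": 0,          # "dumb SATA is free"
    "raid": 5,          # assumed surcharge for RAID
    "raid_backup": 12,  # assumed surcharge for RAID plus backups
}


def monthly_charge_cents(usage_gb):
    """Return the monthly charge in cents for a {tier: gigabytes} mapping."""
    total = 0
    for tier, gigabytes in usage_gb.items():
        if tier not in RATE_CENTS_PER_GB:
            raise ValueError(f"unknown storage tier: {tier}")
        total += gigabytes * RATE_CENTS_PER_GB[tier]
    return total


# A department using 500 GB of plain SATA, 200 GB of RAID,
# and 100 GB of backed-up RAID.
department = {"sata": 500, "raid": 200, "raid_backup": 100}
charge = monthly_charge_cents(department)
```

Working in integer cents avoids floating-point rounding in the totals. If the rates drift away from real costs, this is exactly where users would start gaming the tiers.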
Our next major discussion topic was career paths. Management is still the most common career path for ever-more senior people. In education, it's pretty much the only option, as you have to become a manager to grow into any CTO/CIO/Dean/Provost roles. In industry there's no well-defined career path; there's junior to intermediate to senior, but then it can tend towards either management or architecture/design. One possibility is "minister without portfolio," where you're known internally as a senior resource and various departments bring you the hard problems for advice if not outright solution, and otherwise you just do what needs doing. Some noted that manager-or-techie may be the wrong view. Leadership is the issue: does your organization provide a way to foster and grow leadership skills? It seems that "architect" is the "leader who isn't a manager" title. In addition to growth, the concept of job satisfaction came up. Some are satisfied more by title, some by compensation (salary or benefits), some by growth, and some by having interesting problems to solve. Where are you on that scale, and can your current organization satisfy you? If not, it may be time to find one that can.
After our lunch break, we had a discussion on automation. We talked about some of the differences between host and network based configuration tools, and how at a baseline you need to get a set of consistent configuration files for all the devices at a given point in time. The next problems are to get that configuration information to those devices, then move from that set to another set from a different point in time, in order. Do you keep the configuration data and metadata and state all in the same system, or not? Are the tools topology-aware or not? We should move away from the procedural specification and more towards a declarative mode (such as "build a VPN between point A and point B") and let the tools figure out the "right" way to do it for your environment. Abstracting up to a declarative level will be helpful in the long run but getting there is going to be challenging. The mental model for automation sits at the intersection of "how people think about their systems" and "what data the tool provides" or "what function the tool performs."
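To make the declarative idea concrete, here is a minimal sketch in Python, not a description of any tool mentioned in the workshop. It assumes a configuration set is simply a mapping of device name to settings; the device names (`router-a`, `router-b`) and setting keys (`vpn_peer`, `mtu`) are hypothetical. The operator states the desired set, and the tool works out the ordered steps needed to get there from the current set.

```python
from typing import Dict, List, Tuple

# A configuration set: device name -> {setting name: value}.
Config = Dict[str, Dict[str, str]]
# One change step: (device, action, setting, value), action is "set" or "unset".
Step = Tuple[str, str, str, str]


def plan_changes(current: Config, desired: Config) -> List[Step]:
    """Compute the ordered steps that turn `current` into `desired`."""
    steps: List[Step] = []
    for device in sorted(set(current) | set(desired)):
        have = current.get(device, {})
        want = desired.get(device, {})
        for key in sorted(set(have) | set(want)):
            if key not in want:
                steps.append((device, "unset", key, have[key]))
            elif have.get(key) != want[key]:
                steps.append((device, "set", key, want[key]))
    return steps


def apply_plan(current: Config, steps: List[Step]) -> Config:
    """Apply steps to a copy of `current` and return the resulting set."""
    result = {device: dict(settings) for device, settings in current.items()}
    for device, action, key, value in steps:
        settings = result.setdefault(device, {})
        if action == "set":
            settings[key] = value
        else:
            settings.pop(key, None)
    return result


# Hypothetical example: declare "a VPN between router-a and router-b".
current = {
    "router-a": {"mtu": "1500", "vpn_peer": "none"},
    "router-b": {"mtu": "1500"},
}
desired = {
    "router-a": {"mtu": "1500", "vpn_peer": "router-b"},
    "router-b": {"mtu": "1500", "vpn_peer": "router-a"},
}
plan = plan_changes(current, desired)
```

The sketch leaves out the hard parts raised in the discussion: topology awareness, ordering constraints between devices, and where the state and metadata are stored.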
We next had a quick survey on the hot new technologies or big things to worry about in the coming year. Answers included automating failover and self-healing automation; changing the way people think; chargeback and resource allocation; cloud computing; finding a new job, getting out of the rut, having new challenges; getting useful metrics; outsourcing; politics at work; rebuilding community and improving communications between IT and their users.
Our next topic was communications, both between technical and nontechnical (be they business or faculty as relevant) and between groups within an organization. Having an advocate for IT in the remote business group has been helpful for some people; tours of the data centers for non-technical users have helped others. Empowering users to help themselves, such as with self-service web sites or kiosks, helps as well. IT needs to be recognized as helping the business accomplish its goals, and not as an obstruction or obstacle; for that, IT has to understand those business goals better. It's not that IT should say "No," but rather "Here's a better way to accomplish that" or "Here's what I need to say Yes." Technical people need to remember that just because someone isn't technical doesn't mean they're stupid. One additional note is that, like nurses, we often see people on the worst day of their lives: something is broken, they have a deadline, and we have to fix what's wrong so they can get on with it.
After the afternoon break, we resumed with a discussion on mobility. Laptops and mobile phones are commodities now, so what policies do people have for managing them? There was a reminder that if the policy is too complicated then it'll just be ignored or worked around. Most places have the management of laptops (both corporate and visitor) controlled by now, but handhelds are a newer problem. In general, the policy needs to scope what is and isn't allowed and to focus on what is and isn't within the control of the people enforcing it. This completely ignores the supportability aspects. Are VPNs the answer? DMZs for unauthenticated devices? As with everything else, "it depends."
The security issues involved in managing mobile devices segued into a discussion of identity management; it seems that many people are falling for phishing despite education, outreach, and announcements. Several have implemented email filters to look for personally identifying information in outbound email to try to prevent account compromises.
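An outbound filter of the kind described might look roughly like the following Python sketch. The two patterns (a US Social Security number shape and a "my password is ..." phrase) are assumptions chosen for illustration, not the rules any attendee reported using. A real filter would need far more careful patterns and some way of handling false positives.

```python
import re

# Hypothetical patterns; illustrative only.
PATTERNS = {
    "ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "password_disclosure": re.compile(
        r"\bpassword\s*(?:is|:|=)\s*\S+", re.IGNORECASE
    ),
}


def flag_outbound(body):
    """Return the sorted names of the patterns found in a message body."""
    return sorted(name for name, pattern in PATTERNS.items() if pattern.search(body))


# A typical reply to a phishing message would be flagged and held for review.
reply = "Sure, my username is jdoe and my password is hunter2."
flags = flag_outbound(reply)
```

A message with no matches returns an empty list and would be delivered normally.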
Security in general is about the same as a year ago (4% said "it's better now"). It's still often an afterthought for infrastructure projects. We tried to brainstorm on how to get people to incorporate security. You need management buy-in and to change the culture, whether it's for regulatory reasons or not. It helps if there are policies to point to and guidelines to follow. Security is a method or a process. It's not a result. It does get better when more people understand and follow it.
Next we discussed our preferred scripting languages. Perl, Python, and Ruby continue to be the big winners, with shell scripting (and VBScript for those on Windows) trailing behind. Others included Erlang and Haskell.
We next discussed outsourcing. There's been a rise in many places of the percentage of finance-based managers who don't understand engineering or information technology. Outsourcing is only good in those cookbook situations where there's easily-identified cause and effect, and specific tasks to accomplish. Companies don't think they're giving up control when they outsource. There are two different kinds of outsourcing: Ongoing operations, where automating may be better (but since that's hard, it's next-better to implement it via the API being a request ticket to your outsourcing company), and project-based, where a particular project is given to the outsourcing company. However, it should be pointed out that in India, the average time-in-post is only 3 months, so a 1-year project means you'll have 4 different project teams on average. That gives you lower quality, and going through a non-technical project manager gives you even less control over the implementation. Companies are good at looking at the next quarter but not at long-term costs. One trick mentioned is to realize that the outsourcing company has limited bandwidth and you can fill it up with less-important projects, showing yourself to be willing and getting goodwill, even though you're keeping the more-important projects in house and using some metrics to show how excellent you are at those in-house projects. Finally, we recommended that you keep your core business in house; anything else is asking for trouble.
Our last discussion was a lightning round about the biggest technical or work-related surprises this past year. Some of the surprises included company closures despite profit and growth, company relocations, retiring executive management and changes in management or reporting structures, and responsibility changes.
Due to software issues, this year's Talkies Award could not be awarded. Last year's winner was DJ Gregor, but he was not present this year. The Quotable Award went to Andrew Hume with his comment on cloud computing, "They're like KY jelly: They reduce the friction, you get in quickly, and pull out."
[Ed. note: It occurs to me that I've now written up this workshop for ;login: since 1998, making this my twelfth consecutive ATW summary. -jss]