The short answer was no, it’s all about experience. Here’s a rough longer answer based on my data warehousing experience at 4 different organisations:
- Strong base SAS and macro is essential, this is what the majority of data warehouse jobs use.
- Also important is a strong knowledge of how SAS works internally. What gets written to the log in the following sample, and why?
data _null_;
log 'hi';
set work.test(obs=1);
run; - A general knowledge of SAS and the ability to learn more as needed: e.g. I also use ODS and Graph on a regular basis.
- General knowledge: operating system knowledge (mainframe/Unix/Windows), SQL, reading from different formats (CSV, XML…), writing to various formats (Excel, HTML, XML…), version control, backup…
- Awareness of machine resources and scheduling
- Understand the big picture: can you justify creating an index (space and time vs. potential usage & savings)? Should you create another data set for new data or combine with an existing one? When should you create a macro for a common routine vs. a pre-summarised data set?
- Consistency: users need to be able to transfer their knowledge from one part of the warehouse to another. Don’t make it hard for them.
- Work within the existing warehouse setup: if the naming standards are CamelCase (my preference), don’t create Underscore_Names.
- It’s about doing it right
- Documentation. The ‘d’ word! I try to create clear and commented code, easy-to-use data sets, and document what I’ve done. My goal is to make myself redundant (which hasn’t worked!)
- Communication: essential within your team, to users, and management. Sometimes users and/or managers will have an exact requirement but it should be implemented in a different way to fit the overall picture—you need to spot these and be able to explain why.
- After hours work: users don’t like losing the warehouse to maintenance during working hours!
No comments:
Post a Comment