Valid, direct observation of medical student competency in clinical settings remains challenging and limits the opportunity to promote performance-based student advancement. The rationale for direct observation is to ascertain that students have acquired the core clinical competencies needed to care for patients. Too often student observation results in highly variable evaluations which are skewed by factors other than the student's actual performance. Among the barriers to effective direct observation and assessment include the lack of effective tools and strategies for assuring that transparent standards are used for judging clinical competency in authentic clinical settings. We developed a web-based content management system under the name, Just in Time Medicine (JIT), to address many of these issues. The goals of JIT were fourfold: First, to create a self-service interface allowing faculty with average computing skills to author customizable content and criterion-based assessment tools displayable on internet enabled devices, including mobile devices; second, to create an assessment and feedback tool capable of capturing learner progress related to hundreds of clinical skills; third, to enable easy access and utilization of these tools by faculty for learner assessment in authentic clinical settings as a means of just in time faculty development; fourth, to create a permanent record of the trainees' observed skills useful for both learner and program evaluation. From July 2010 through October 2012, we implemented a JIT enabled clinical evaluation exercise (CEX) among 367 third year internal medicine students. Observers (attending physicians and residents) performed CEX assessments using JIT to guide and document their observations, record their time observing and providing feedback to the students, and their overall satisfaction. Inter-rater reliability and validity were assessed with 17 observers who viewed six videotaped student-patient encounters and by measuring the correlation between student CEX scores and their scores on subsequent standardized-patient OSCE exams. A total of 3567 CEXs were completed by 516 observers. The average number of evaluations per student was 9.7 (±1.8 SD) and the average number of CEXs completed per observer was 6.9 (±15.8 SD). Observers spent less than 10 min on 43-50% of the CEXs and 68.6% on feedback sessions. A majority of observers (92%) reported satisfaction with the CEX. Inter-rater reliability was measured at 0.69 among all observers viewing the videotapes and these ratings adequately discriminated competent from non-competent performance. The measured CEX grades correlated with subsequent student performance on an end-of-year OSCE. We conclude that the use of JIT is feasible in capturing discrete clinical performance data with a high degree of user satisfaction. Our embedded checklists had adequate inter-rater reliability and concurrent and predictive validity.