2022 Northwest Managers of Educational Technology Conference Summary

This April I attended the Northwest Managers of Educational Technology (NMET) conference, held this year in Coeur d’Alene, Idaho. Since there’s nothing quite like this group in the Southeast, it felt well worth it to fly across the country to enjoy a little normalcy and connect in person again with fellow A/V professionals focused on education. Of course, I can’t deny that this year’s location on the shores of beautiful Lake Coeur d’Alene was an added draw. The conference was well attended (I’m guessing ~100 attendees) and exceptionally well run. NMET is a close-knit organization with a history that spans several decades, going all the way back to the beginning of the AV industry as we know it in the era of analog media.

Lake Coeur d'Alene

TOPICS

  • Responses to the pandemic and various schools’ efforts to work toward a “new normal”
  • The CARES Act as a catalyst for A/V classroom upgrades: Using CARES funds, UNLV launched a huge new program during COVID called RebelFlex. It is seen as largely successful and would likely not have been possible otherwise. (Duke, along with several other top private universities such as Harvard and Princeton, chose not to accept CARES Act funding.)
  • COVID as a driver for A/V initiatives and standardization: Many schools saw decision-making for A/V and IT-related projects shift to the provost level and higher as schools developed alternative teaching strategies such as “emergency,” “HyFlex,” “hybrid,” “co-mingled,” and remote teaching as pandemic responses. In most cases timelines for implementing major A/V projects sped up significantly as well.
  • COVID as a driver for A/V standardization: Oregon State University described how COVID helped their campus standardize on an enterprise A/V strategy that centered on Kaltura, Canvas, and Zoom, and quieted demand for competing tools. Interestingly, OSU does not use a dedicated recording tool such as Panopto but instead utilizes Zoom for all recording and pushes this content to Kaltura within Canvas course sites. 
  • Faculty support models for hybrid teaching: UNLV’s RebelFlex program experimented with hiring students who were assigned to in-person classes as tech support. While overall this seemed successful, there were challenges, such as the declining need for tech support as the semester went on and faculty became familiar with the new technologies involved. Additionally, it was observed that faculty members tended over time to morph their student helpers’ roles into ones resembling TAs and research assistants, including using these helpers as moderators for their Zoom chats.
  • Building a Networking Group like NMET: Some of the conference attendees were surprised I came all the way from North Carolina to attend the conference. “You mean the Duke?” several asked. I explained there’s nothing in the Southeast comparable to NMET, an education-driven organization focused on the intersection of A/V and IT. That’s sad, but not surprising in a way, since a successful organization like NMET isn’t built overnight. NMET began holding conferences in 1979 and is the result of the hard work and passion of several generations of A/V professionals who have made up its membership.
  • The A/V Superfriends Podcast (https://www.avsuperfriends.com/): Some of the members of NMET together with other A/V professionals extending beyond that group maintain a very cool podcast for A/V professionals focused on the intersection of A/V and pedagogy in higher ed. They were actually recording new episodes of the podcast live in the exhibit area. Members of this group led several interesting conference sessions focused primarily on the impact of COVID for classroom technology. Recent topics of their podcast include: 
    • Managing POs and supply chain issues
    • Campus support structures
    • Auto-framing and auto-tracking cameras
    • Cabling infrastructure and TIA standards
    • The intersection of A/V and IT in hiring new staff
    • AV replacement cycles–do we set arbitrary schedules of 5, 7, 10 years or tie AV refresh projects to capital projects?
    • Bootstrapping light video production switchers into classroom systems
  • AV over IP: Some argued that the NDI (Network Device Interface) protocol represents the wave of the future, and that we should future-proof our classrooms by purchasing NDI-capable cameras.
  • Benefits and drawbacks of Zoom certification: The view expressed was that certification may be OK as long as it is not mandated or exploited for commercial benefit (cross reference Tandberg).
  • “Hybrid” (instructor-driven) vs. “HyFlex” (student-centered) classrooms
  • USB as the “common language of hybrid learning spaces”
  • Elevating sound quality in the rush to add A/V infrastructure to classrooms 
  • Keeping classroom AV UIs simple and standard even in classrooms where there is great complexity under the hood
  • ePTZ (auto-tracking) cameras: Good lighting is important, and fixed positions work better than continuous tracking
  • Making a virtual lightboard: One presenter showed how he used Procreate and a green screen in front of the presenter to make a virtual lightboard


VENDORS

  • Kaltura: Kaltura was one of three main sponsors of the conference. As mentioned above, Oregon State University, which was the main organizer of the conference, is a Kaltura customer. It was noted that Kaltura, unlike most other vendors, still offers an unlimited storage and bandwidth licensing tier, although it was mentioned it is “expensive.”
  • Panasonic: Panasonic was another major sponsor of the conference. Their projectors and displays were used in conference venues.
  • Elmo was showcasing its wide array of document cameras, from a $200.00 USB model to similarly portable wireless options starting at ~$800.00 to its flagship, the PX-30E (MSRP $3700.00), a 4K, 12x optical zoom model designed for fixed classroom installations. Interestingly, while WolfVision is the 500 lb gorilla in the doc cam space, Elmo actually invented the document camera and is the older company.
  • Epiphan was showcasing its well-known Pearl live encoder lineup along with its cool new device, the LiveScrypt. The LiveScrypt connects to Epiphan Cloud to add live ASR-based captions to your live production. These captions can be embedded in your live streams or sent out to monitors in the room for display at in-person or hybrid events. There is a charge of $10.00/hr to use the cloud-based ASR service in addition to the $1,500.00 cost of the device itself.
  • Alfatron had its wide range of PTZ cameras on display, ranging from an MSRP of $700.00 to $2150.00.
  • Shure had a booth showcasing equipment by Stem, a company they recently acquired. Stem offers complete solutions for outfitting conference and meeting rooms with a range of mics, including tabletop, wall, and ceiling mounted ones, together with a hub and an integrated control system for managing the individual elements.
  • Smart was demoing its latest lineup of interactive displays.
  • Legrand AV showcased a wide range of products focused on physical classroom infrastructure, including displays, display mounts, projectors, PTZ cameras, speakers, device controllers, and network switches. Legrand is a large company that owns Vaddio, Chief, Da-Lite, and Middle Atlantic Products.
  • Cleardigital featured Vue, its modular display wall with very smooth touch surfaces and replaceable panels, as well as other products such as a PTZ cam (the RL400), a portable doc cam, and an all-in-one conference camera.
  • Newline Interactive was featuring its newest interactive and non-interactive displays, ranging from 27” to 98”.
  • AVer gave a conference session demoing its new auto-tracking PTZ camera, the TR333V2. The TR333V2 offers:
    • 30x optical zoom
    • Sophisticated pre-set configuration, including the ability to move in and out of continuous tracking and fixed position mode based on how an instructor moves in the classroom
    • 4K
    • 3G-SDI, HDMI, IP, and USB output 
    • Full or half body tracking

Cisco Visits Duke

This past Friday, Cisco visited the Technology Engagement Center on Duke’s campus to provide an update on their software and hardware offerings. While much of the conversation revolved around “behind the scenes” updates to the platform and general trends (on-prem vs. hybrid installs vs. Cloud), they did mention a range of new AI-based features that may be available in the near future, specifically transcription, translation, and virtual assistant services.

No Cisco conversation would be complete without an overview of their existing and soon-to-be-released hardware. While many of their offerings are unchanged, they do plan to offer a version of the Cisco Webex Room Kit Mini without the codec, for rooms where you simply need BYOD support. If a codec is needed for the room, a simple software key will bring the room up to a full Webex Room Kit Mini.

New AI-Based Transcription Service Otter.ai

Some of Duke’s Communications staff have been experimenting lately with Otter.ai, a new transcription service that offers 600 free minutes per month, and seem to be enjoying it. Otter, which was started by an ex-Google engineer in early 2019, is an interesting move forward in the captioning and ASR space. Its focus seems to be less on captioning, as with Rev.com (a widely used service at Duke), and more on live recording of meetings via your browser and making searchable transcripts available in a collaborative, team-based environment. I had some problems using Otter to produce a caption file, but it does seem like Otter could be useful for simple transcription workflows, and the idea of using something like Otter to record all your meetings raises some interesting possibilities and questions.

Otter.ai

Below is a summary of what I found in my initial testing:

  • High accuracy, comparable to other vendors we’ve tested recently utilizing the newest ASR engines
  • Interesting collaboration feature set
  • Can record your meeting right from within the browser
  • Nice free allotment: 600 free mins/month (6000/month for the pro plan, education pricing $5.00/month)
  • Includes speaker identification
  • If your goal is captions and not just transcriptions, Otter is more limited: it only seems to support export of captions in .srt format (not .vtt, which some of our users, including the Duke Libraries, prefer)
  • The .srt I exported from my test file was grouped by paragraph, not by line, so it wouldn’t be possible to use it with one of our video publishing systems like Warpwire or Panopto without extensive editing to chunk the file up by line.
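
For anyone who wants to experiment with the paragraph-grouped .srt problem, here is a rough Python sketch of one way to chunk long cues into line-sized ones and write the result out as .vtt. It is not a workflow we have validated: the function names are my own, the 42-character line length is just a starting assumption, and each chunk’s timing is estimated by splitting the original cue’s duration in proportion to character count, so the output would still need a pass in a caption editor to check sync.

```python
import re
import textwrap

# Matches SRT (comma) or WebVTT (period) timestamps such as 00:01:02,500
TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_ms(stamp):
    """Convert an HH:MM:SS,mmm timestamp to milliseconds."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognized timestamp: {stamp!r}")
    h, m, s, ms = map(int, match.groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def to_vtt(ms):
    """Format milliseconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def parse_srt(text):
    """Return a list of (start_ms, end_ms, text) tuples from SRT content."""
    cues = []
    for block in re.split(r"\n\s*\n", text.replace("\r\n", "\n").strip()):
        lines = block.strip().splitlines()
        idx = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if idx is None:
            continue  # skip blocks without a timing line
        start, end = lines[idx].split("-->")
        body = " ".join(line.strip() for line in lines[idx + 1:])
        cues.append((to_ms(start), to_ms(end.split()[0]), body))
    return cues


def split_cue(start, end, text, max_chars=42):
    """Split one long cue into shorter cues.

    Timing is an estimate: the cue's duration is divided in proportion
    to the number of characters in each chunk.
    """
    chunks = textwrap.wrap(text, width=max_chars)
    total = sum(len(chunk) for chunk in chunks)
    pieces, cursor, used = [], start, 0
    for chunk in chunks:
        used += len(chunk)
        chunk_end = start + (end - start) * used // total
        pieces.append((cursor, chunk_end, chunk))
        cursor = chunk_end
    return pieces


def srt_to_vtt(text, max_chars=42):
    """Convert SRT text to WebVTT, chunking long cues along the way."""
    out = ["WEBVTT", ""]
    for start, end, body in parse_srt(text):
        for c_start, c_end, chunk in split_cue(start, end, body, max_chars):
            out += [f"{to_vtt(c_start)} --> {to_vtt(c_end)}", chunk, ""]
    return "\n".join(out)
```

To try it, read the exported file into a string, pass it to srt_to_vtt(), and save the result with a .vtt extension. Because the timing of each chunk is only estimated, a speaker who pauses mid-paragraph will throw the captions off, which is why a review in an editor is still needed.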

Comparing Machine Transcription Options from Rev and Sonix

As part of our continuing exploration of new options for transcription and captioning, two members of our media production team tested the automated services offered by both Rev and Sonix. We submitted the same audio and video files to each service and compared the results. Overall, both services were surprisingly accurate and easy to use. Sonix, in particular, offers some unique exporting options that could be especially useful to media producers. Below is an outline of our experience and some thoughts on potential uses.

Accuracy

The quality and accuracy of the transcription seemed comparable. Both produced transcripts with about the same number of errors. Though errors occurred at similar rates, they interestingly almost always occurred in different places. All of the transcripts would need cleaning up for official use but would work just fine for editing or review purposes. The slight edge might go to Rev here. It did a noticeably better job at distinguishing and identifying unique speakers, punctuating, and in general (but not always) recognizing names and acronyms.  

Interface

When it came time to share and edit the transcripts, both services offered similar web-based collaborative tools. The tools feature basic word processing functions and allow multiple users to highlight, strikethrough, and attach notes to sections of text. After its recent updates, the Rev interface is slightly cleaner and more streamlined. Again, the services are pretty much even in this category.

Export Options

This is where things get interesting. Both services allow users to export transcripts as documents (Microsoft Word, Text File, and, for Sonix, PDF) and captions (SubRip and WebVTT). However, Sonix offers some unique export options. When exporting captions, Rev automatically formats the length and line breaks of the subtitles and produces reliable results. Sonix, on the other hand, provides several options for formatting captions including character length, time duration, number of lines, and whether or not to include speaker names. The downside was that using the default settings for caption exporting in Sonix led to cluttered, clunky results, but the additional options would be useful for those looking for more control of how their captions are displayed.

Sonix also allows two completely different export options. First, users can export audio or video files that include only highlighted sections of the transcript or exclude strikethroughs. Basically, you can produce a very basic audio or video edit by editing the transcript text. It unfortunately does not allow users to move or rearrange sections of media and the edits are all hard cuts so it’s a rather blunt instrument, but it could be useful for rough cuts or those with minimal editing skills.

Sonix also provides the option of exporting XML files that are compatible with Adobe Audition, Adobe Premiere, and Final Cut Pro. When imported into the editing software these work like edit decision lists that automatically cut and label media in a timeline. We tried this with two different audio files intended for a podcast, and it worked great. This has the potential to be useful for more complicated and collaborative post-production workflows, an online equivalent of an old school “paper edit”. Again, the big drawback here is the inability to rearrange the text. It could save time when cutting down raw footage, but a true paper edit would still require editing the transcript with timecode in a word processing program.

And the winner is…

Everyone. Both Rev and Sonix offer viable and cost-effective alternatives to traditional human transcription. Though the obvious compromise in accuracy exists, it is much less severe than you might expect. Official transcripts or captions could be produced with some light editing, and, from a media production perspective, quick and cheap transcripts can be an extremely useful tool in the post-production process. Those looking to try a new service or stick with the one they’re familiar with can be confident that they’re getting the highest quality machine transcription available with either company. As more features get added and improved, like those offered by Sonix, this could become a helpful tool throughout the production process.

New Machine Transcription Option from Rev

We recently posted about some exciting new options in the world of captioning spearheaded by a company called Sonix, which offers a page for account set-up for members of the Duke community that waives monthly subscription charges as part of their edu program. Hot on the heels of that announcement, we learned that Rev.com, who has long offered high quality human-generated transcriptions for Duke, now has their own machine transcription option. It’s a bit more expensive than Sonix at ten cents per minute as opposed to around 8 cents per minute for Sonix. We’re working on a detailed comparison of the two services and will share more info here as we have it.

Rev's New Machine Transcription Option

Rev also just announced improvements to their caption editor. We’d love to have your feedback about these changes as well as about your use of Rev’s new machine transcription option. According to Rev, the improvements to the editor include:

  • Text selection toolbar – keep your timestamp, highlight, strikethrough, and comment tools where you need them, contextually accessible next to the text you just selected.
  • White theme – a light, minimal color scheme to bring the Transcript Editor into the same modern styling as the rest of Rev.com.
  • Streamlined transcript body – no more cluttered columns, all speaker names and timestamps are now in-line with the transcript body, so you can focus on the content that matters to you.

For a full, updated walkthrough of all Transcript Editor functionality, see The Rev Transcript Editor, a Guide for First Time Users.

Help Us Test Sonix.ai

OIT has been following what’s happening in the evolving world of captioning over the years, and in particular monitoring the field for high quality, affordable services we think would be useful to members of the Duke community. When Rev.com came along, offering guaranteed 99% accurate human-generated captions for a flat $1.00 a minute (whereas some comparable services were well over $3.00/minute), we took note and have facilitated a collaboration with them that has been very productive for Duke. A recent review of our usage shows that a lot of you are using Rev, with a huge uptick in usage over the last couple years, and we’ve heard few if any complaints about the service.

While in general there has been a dismissive attitude toward machine (automatic) transcription, the newest generation technology, based on IBM Watson, has become so good that we can no longer (literally) afford to ignore it. With good quality audio to work from, this speech-to-text engine claims to deliver accuracy as high as 95% or more. IBM Watson isn’t a consumer-facing service, but we’ve been on the lookout for vendors building on this platform, and have found one we feel is worth exploring called Sonix. If cost is a significant factor for you, you might consider giving it a try.

Sonix captioning costs a little over 8 cents per minute, and the company has waived the monthly subscription requirement and offered 30 free minutes of captioning for anyone with a duke.edu email address who sets up their account through this page: https://sonix.ai/academic-program/duke-university.

We are not recommending Sonix at this time, but are interested to hear what your experiences with them are. And we would caution that with any machine transcription technology, a review of your captions via the company’s online editor is required if you want to use this as closed captions (vs just a transcription). In our initial testing Sonix’s online editor looks fairly quick and easy to use.

If you set up an account and try Sonix, please reach out to oit-mt-info@duke.edu to let us know what your experiences are and what specific use cases it supports.

New Machine Caption Options Look Interesting

We wrote in April of last year about the impact of new AI and machine learning advances in the video world, and specifically around captioning. A little less than a year later, we’re starting to see the first packaged services being offered that leverage these technologies and make them available to end users. We’ve recently evaluated a couple options that merit a look:

Syncwords

Syncwords offers machine transcriptions/captions for $0.60 per minute, and human-corrected transcriptions for $1.35 per minute. We tested this service recently and the quality was impressive. Only a handful of words needed adjustment on the 5-minute test file we used, and none of them seemed likely to significantly interfere with comprehension. The recording quality of our test file was fairly high (low noise, words clearly audible, enunciated clearly).

Turnaround time for machine transcriptions is about 1/3 of the media run time on average. For human-corrected transcriptions, the advertised turnaround time is 3-4 business days, but the company says the average is less than 2 days. The rush human transcription option is $1.95 with a guaranteed turnaround of 2 business days and, according to the company, average delivery within a day.

Syncwords also notes edu and quantity discounts are available for all of these services, so please inquire with them if interested.

Sonix.ai

Sonix is a subscription-based service with three tiers: Single-User ($11.25 per month and $6.00 per recorded hour, or $0.10/minute), Multi-User ($16.50 per user/month and $5.00 per recorded hour), and Enterprise ($49.50 per user/month, pricing available upon request). You can find information about the differences among the tiers here: https://sonix.ai/pricing
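
To get a feel for how the Single-User and Multi-User tiers compare, here is a small Python sketch of the arithmetic using the prices listed above. It assumes a single user on the Multi-User tier and ignores any edu or volume discounts; the function and variable names are just for illustration.

```python
# (monthly fee in dollars, price per recorded hour in dollars), from the tiers above
SINGLE_USER = (11.25, 6.00)
MULTI_USER = (16.50, 5.00)  # assumes one user on the plan


def monthly_cost(plan, recorded_hours):
    """Total monthly cost for a plan at a given number of recorded hours."""
    monthly_fee, hourly_rate = plan
    return monthly_fee + hourly_rate * recorded_hours


def break_even_hours(cheaper_fee_plan, cheaper_rate_plan):
    """Recorded hours per month at which the two plans cost the same."""
    fee_a, rate_a = cheaper_fee_plan
    fee_b, rate_b = cheaper_rate_plan
    return (fee_b - fee_a) / (rate_a - rate_b)


hours = break_even_hours(SINGLE_USER, MULTI_USER)
print(f"Break-even at {hours} recorded hours per month")
for h in (2, 10):
    print(h, monthly_cost(SINGLE_USER, h), monthly_cost(MULTI_USER, h))
```

Under these assumptions the two tiers cost the same at 5.25 recorded hours per month. Below that the Single-User tier is cheaper, and above it the Multi-User tier’s lower hourly rate wins out.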

The videos in the folder below show the results of our testing of these two services together with the built-in speech-to-text engine currently utilized by Panopto. To be fair, the service currently integrated with Panopto is free with our Panopto license, and for Panopto to license the more current technology would likely increase their costs and ours. We do wonder, however, whether it is simply a matter of time before the current state-of-the-art services such as those featured here become more of a commodity:

https://oit.capture.duke.edu/Panopto/Pages/Sessions/List.aspx?folderID=4bd18f0c-e33a-4ab7-b2c9-100d4b33a254

Rev Adds New Rush Option

Rev.com’s captioning services have been in wide use at Duke for the last couple years in part because of their affordability (basic captioning is a flat $1.00/minute), the generally high accuracy of the captions, and the overall quality of the user experience Rev offers via its well-designed user interfaces and quality support. Quick turnaround time is another factor Duke users seem to appreciate. While the exact turnaround times Rev promises are based on file length, we’ve found that most caption files are delivered same or next day.

Rev.com

For those of you who need guaranteed rush delivery above and beyond what Rev already offers, the company just announced it now offers an option that promises files in 5 hours or less from order receipt. There is an additional charge of $1.00/minute for this service. To choose this option, simply select the “Rush My Order” option in desktop checkout.

If any of you utilize the new rush service, we’d love to hear how it goes. Additionally, if you have any other feedback about your use of Rev or other caption providers, please feel free to reach out to oit-mt-info@duke.edu.

New Search Feature on Rev.com

While OIT continues to actively explore captioning technologies and vendors as they become available, Rev.com has been the most used captioning vendor at Duke in recent years because of their relatively low cost and intuitive workflows. Rev just announced a new feature whose appeal is likely to be proportional to the amount of captioning you’ve done with Rev, or plan to do. You can now search your body of caption files on the Rev website for individual words and phrases and call up the specific caption files in which those words are included. In the example below, searching for the word “blockchain” on an account pulls up a couple options. You can click on the title of the individual caption files to go to that file within Rev’s online editor.

Rev's Search Feature

Wirecast 10 Adds Live Captions

Wirecast recently announced a new cloud-based service that supports live captions based on ASR (automatic speech recognition) and an RTMP re-streaming service. Both work in conjunction with Wirecast 10. This means that if you are using Wirecast 10, you can automatically caption your videos and simultaneously push them to another provider like YouTube or Facebook Live. This is an interesting development because we are seeing the entrance of new ASR platforms like IBM Watson that claim to offer much greater accuracy than has been possible with earlier generation ASR technologies. I’m not sure what platform Wirecast is leveraging, but we’d love to hear from anyone at Duke using Wirecast 10 who is willing to give their 100-minute free trial a go.

New Wirecast Cloud Services

It’s a subscription-based service with monthly fees starting at $25.00/month for re-streaming and $60.00/month for live captions. Detailed information and a link to set up an account and get started can be found here:

https://www.telestream.net/wirecast/webservices/