
(12) United States Patent
     Singh et al.
(10) Patent No.: US 9,495,967 B2
(45) Date of Patent: Nov. 15, 2016

(54) COLLABORATIVE AUDIO CONVERSATION ATTESTATION
(71) Applicant: Intel Corporation, Santa Clara, CA (US)
(72) Inventors: Dave Paul Singh, Portland, OR (US); payink Fulginiti, Beaverton, OR (US) (Continued)
(73) Assignee: Intel Corporation, Santa Clara, CA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 151 days.
(21) Appl. No.: 14/124,431
(22) PCT Filed: Aug. 20, 2013
(86) PCT No.: PCT/US2013/055789
     § 371 (c)(1), (2) Date: Dec. 6, 2013
(87) PCT Pub. No.: WO2015/026329
     PCT Pub. Date: Feb. 26, 2015
(65) Prior Publication Data
     US 2015/0058017 A1    Feb. 26, 2015
(51) Int. Cl.
     G10L 17/00 (2013.01)
     G10L 17/22 (2013.01)
     (Continued)
(52) U.S. Cl.
     CPC G10L 17/22 (2013.01); G06F 21/10 (2013.01); G06F 21/6218 (2013.01); (Continued)
(58) Field of Classification Search
     CPC G06F 21/10; G06F 3/165; G10L 17/02; G10L 17/04; H04L 2463/101; H04L 63/102; H04L 63/12; H04L 65/403
     (Continued)
(56) References Cited
     U.S. PATENT DOCUMENTS
     6,477,491 B1 *  11/2002  Chandler  G10L 15/26; 704/235
     6,980,953 B1 *  12/2005  Kanevsky  G06F 12:
     (Continued)
     FOREIGN PATENT DOCUMENTS
     WO  WO-2015026329 A1  2/2015
     OTHER PUBLICATIONS
     "International Application Serial No. PCT/US2013/055789, International Search Report mailed May 20, 2014", 3 pgs.
     (Continued)
Primary Examiner: Michael Colucci
(74) Attorney, Agent, or Firm: Schwegman Lundberg & Woessner, P.A.

(57) ABSTRACT
Disclosed in some examples are systems, methods, devices, and machine readable mediums which may produce an audio recording with included verification from the individuals in the recording that the recording is accurate. In some examples, the system may also provide rights management control to those individuals. This may ensure that individuals participating in audio events that are to be recorded are assured that their words are not changed, taken out of context, or otherwise altered and that they retain control over the use of their words even after the physical file has left their control.

19 Claims, 7 Drawing Sheets
[Representative drawing: flowchart of method 2000, operations 2010 through 2070, shown in full as FIG. 2 on Sheet 2 of 7.]
US 9,495,967 B2
Page 2

(72) Inventors: Mahendra Tadi Tadikonda, Portland, OR (US); Tobias Kohlenberg, Portland, OR (US)
(51) Int. Cl.
     G10L 15/26 (2006.01)
     G06F 21/10 (2013.01)
     G06F 21/62 (2013.01)
     H04L 29/06 (2006.01)
     G10L 19/018 (2013.01)
(52) U.S. Cl.
     CPC G10L 15/26 (2013.01); G10L 19/018 (2013.01); H04L 63/0861 (2013.01); H04L 65/1089 (2013.01); H04L 65/403 (2013.01); G06F 2221/2141 (2013.01); H04L 2463/101 (2013.01)
(58) Field of Classification Search
     USPC 704/249, 235, 239, 246; 84/625; 726/26, 27; 725/18, 46; 713/168; 709/231; 707/634; 455/406, 414.1; 379/68, 373.01, 201.02; 348/211.12
     See application file for complete search history.
(56) References Cited
     U.S. PATENT DOCUMENTS
     7,516,078 B2 *     4/2009   Dhawan           H04M 1/7253; 379/87
     8,081,751 B1 *     12/2011  Martin           H04M 3/42017; 379/373.01
     8,428,227 B2 *     4/2013   Angel            G10L 15/26; 379/68
     8,539,543 B2 *     9/2013   Schnell          G06F 21/10
     8,660,539 B2       2/2014   Khambete         G06F 17/36; 370/310.2
     8,826,316 B2 *     9/2014   Jain             G06F 3/061; 704/500
     8,831,196 B2 *     9/2014   Moyers           G06Q 10/10; 370/352
     8,931,059 B2 *     1/2015   Moroney          G06F 21/10; 726/10
     9,154,534 B1 *     10/2015  Gayles           H04L 65/60
     2004/0230797 A1 *  11/2004  Ofek             G06F 21/14; 713/168
     2004/0263636 A1 *  12/2004  Cutler           H04N 7/15; 348/211.12
     2005/0198510 A1 *  9/2005   Robert           G06F 21/10; 713/175
     2007/0245378 A1 *  10/2007  Svendsen         H04N 7/17318; 725/46
     2007/0274293 A1    11/2007  Forbes
     2008/0115224 A1 *  5/2008   Jogand-Coulomb   G06F 21/10; 726/27
     2008/0115225 A1 *  5/2008   Jogand-Coulomb   G06F 21/79; 726/27
     2009/0043573 A1    2/2009   Weinberg et al.
     2010/0095829 A1    4/2010   Edwards          G10H 1/365; 84/625
     2010/0333209 A1 *  12/2010  Alve             G06F 21/10
     2011/0039518 A1 *  2/2011   Mari Hoaria ala  455/406
     2011/0113011 A1 *  5/2011   Prorock          G11B 27/36; 707/634
     2011/0145000 A1    6/2011   Hoepken et al.
     2011/0288866 A1 *  11/2011  Rasmussen        H04L 12/1831; 704/246
     2012/0232900 A1 *  9/2012   Brummer          G10L 17/02; 704/239
     2013/0136242 A1    5/2013   Ross et al.
     2013/0318624 A1 *  11/2013  Monsifrot        G06F 21/10; 726/26
     OTHER PUBLICATIONS
     "International Application Serial No. PCT/US2013/055789, Written Opinion mailed May 20, 2014", 6 pgs.
     * cited by examiner
U.S. Patent    Nov. 15, 2016    Sheet 1 of 7    US 9,495,967 B2

[Drawing: system schematic with labeled blocks for a capture device, a verification device, a playback device, and back end processing.]

FIG. 1
U.S. Patent    Nov. 15, 2016    Sheet 2 of 7    US 9,495,967 B2

[Drawing: flowchart of method 2000.]
2010: Record voice exemplar
2020: Record event audio
2030: Recognize and tag audio segments
2040: Send information for each segment to each tagged speaker
2050: Receive verification information of tagged segments and, in some examples, assigned DRM for each segment
2060: All segments accounted for? (Yes: proceed to 2070)
2070: Master recording with policy tags is created and distributed

FIG. 2
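
The decision at operation 2060 and the assembly at operation 2070 can be illustrated with a minimal Python sketch. Everything named here is hypothetical and is not taken from the disclosure: the function `build_master_recording`, the dictionary layout, and the decision strings are assumptions chosen only to show one way a master recording might be withheld until every tagged speaker has responded for every segment.

```python
def build_master_recording(segments, responses):
    """Return a master recording dict, or None while responses are outstanding.

    Hypothetical data layout (an assumption, not from the disclosure):
      segments:  {segment_id: set of speaker ids tagged on that segment}
      responses: {(segment_id, speaker_id): {"decision": str, "drm": dict}}
    """
    master = {}
    for segment_id, speakers in segments.items():
        tags = {}
        for speaker in sorted(speakers):
            response = responses.get((segment_id, speaker))
            if response is None:
                # Operation 2060 answers "no": a tagged speaker has not
                # yet verified (or denied) this segment.
                return None
            # Keep the decision and the DRM choice as policy tags.
            tags[speaker] = response
        master[segment_id] = tags
    # Operation 2070: every segment is accounted for.
    return master
```

In this sketch a denial still counts as a response, so a segment with a denial is accounted for and carries the denial as a tag.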
U.S. Patent    Nov. 15, 2016    Sheet 3 of 7    US 9,495,967 B2

[Drawing: flowchart of method 3000.]
Determine active speaker for time index N
Compare last known active speaker to determined active speaker
Are the active speaker and last known active speaker different?
3030: Define new segment, tag new segment with identified active speaker
Repeat periodically for time index N+P

FIG. 3
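
The loop of method 3000 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: speaker recognition is assumed to have already produced, for each sampled time index, the set of active speaker identities, and the names `Segment` and `segment_by_active_speakers` are hypothetical. Treating the active speakers as a set lets the same comparison cover both a single speaker and a group of speakers.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float          # seconds from the start of the recording
    end: float            # seconds from the start of the recording
    speakers: frozenset   # identities tagged on the segment


def segment_by_active_speakers(samples, period):
    """Split a recording into segments wherever the active speakers change.

    samples: one collection of speaker ids per sampled time index
             (assumed output of a speaker recognition step).
    period:  sampling period P in seconds.
    """
    segments = []
    current = None
    last_known = frozenset()
    for index, active in enumerate(samples):
        active = frozenset(active)
        now = index * period
        if active != last_known:
            # The active speaker (or group) changed: close the open
            # segment and define a new one tagged with the new speakers.
            if current is not None:
                current.end = now
                segments.append(current)
            # Assumption: silence (no active speaker) opens no segment.
            current = Segment(now, now, active) if active else None
            last_known = active
    if current is not None:
        current.end = len(samples) * period
        segments.append(current)
    return segments
```

Because the boundaries fall on multiples of the sampling period, a real system would likely refine each boundary afterwards; that refinement is outside this sketch.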
U.S. Patent    Nov. 15, 2016    Sheet 4 of 7    US 9,495,967 B2

[Drawing: flowchart of method 4000.]
Receive segment information
Present segment information, verification options, DRM options
Receive verification decision and DRM decision
Send verification decision and DRM decision

FIG. 4
U.S. Patent    Nov. 15, 2016    Sheet 5 of 7    US 9,495,967 B2

[Drawing: flowchart of method 5000.]
Receive selection of audio file
Receive an action selection
For each particular segment, determine if DRM allows that particular action
Yes: Perform action on that particular segment
No: Do not perform action on that particular segment
Repeat for each involved segment

FIG. 5
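
The per-segment check of method 5000 can be sketched in Python. The sketch is illustrative only and rests on assumptions that are not part of the disclosure: a policy is modeled as a dictionary mapping a restricted action to the set of users allowed to perform it, an action absent from a policy is treated as unrestricted, and the names `segment_allows` and `perform_action` are hypothetical. Where several speakers have placed DRM on the same segment, the sketch requires every speaker's policy to be satisfied.

```python
def segment_allows(speaker_policies, action, user):
    """Return True only if every speaker's policy permits the action.

    speaker_policies: {speaker_id: {action: set of allowed users}}
    Assumption: an action missing from a policy is unrestricted, and an
    empty set of allowed users prohibits the action for everyone.
    """
    for policy in speaker_policies.values():
        if action in policy and user not in policy[action]:
            return False
    return True


def perform_action(audio_file, action, user, do_action):
    """Apply do_action to each segment whose DRM allows it for this user.

    audio_file: {segment_id: speaker_policies}
    Returns the ids of the segments on which the action was performed.
    """
    performed = []
    for segment_id, speaker_policies in audio_file.items():
        if segment_allows(speaker_policies, action, user):
            do_action(segment_id)
            performed.append(segment_id)
        # Otherwise the action is not performed on that segment.
    return performed
```

For example, a playback application could pass a function that plays one segment as `do_action`, so that only permitted segments are heard.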
U.S. Patent    Nov. 15, 2016    Sheet 6 of 7    US 9,495,967 B2

[Drawing: system schematic. Labels recoverable from the drawing:]
Capture device 6010: audio capture module 6020; voice recognition module; digital rights management module 6030; input and output module; review module; playback module 6080; control module. (Numerals 6040 and 6060 also appear on this device.)
Network (e.g., Internet)
End use computing device 6150: input and output module; playback module; digital rights management module 6170. (Numeral 611C also appears on this device.)
Verification computing device 6100: input and output module 6115; review module 6130; digital rights management module 6120; playback module 6140.

FIG. 6
U.S. Patent    Nov. 15, 2016    Sheet 7 of 7    US 9,495,967 B2

[Drawing: block diagram of machine 7000. Labels recoverable from the drawing: video display 7010, alpha-numeric input device, UI navigation device, network interface device, signal generation device.]
COLLABORATIVE AUDIO CONVERSATION ATTESTATION

PRIORITY APPLICATION

This application is a U.S. National Stage Application under 35 U.S.C. 371 from International Application No. PCT/US2013/055789, filed Aug. 20, 2013, which is hereby incorporated by reference in its entirety.

TECHNICAL FIELD

Embodiments pertain to audio recording. In particular, some embodiments pertain to audio verification and control.

BACKGROUND

Audio conversations may be recorded by a number of audio capture technologies. For example, computing devices may capture audio using an on-board or connected microphone and store it digitally in flash memory or other storage. Example computing devices include a personal digital recorder, a laptop, a desktop, a cellphone, a portable music player (e.g., an iPod™), or the like. The digital audio files created by these devices may be accessed by users after the recording is complete.

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings, which are not necessarily drawn to scale, like numerals may describe similar components in different views. Like numerals having different letter suffixes may represent different instances of similar components. The drawings illustrate generally, by way of example, but not by way of limitation, various embodiments discussed in the present document.

FIG. 1 is a high level schematic of a system according to some examples of the present disclosure.

FIG. 2 is a flowchart of a method according to some examples of the present disclosure.

FIG. 3 is a flowchart of a method of recognizing audio segments according to some examples of the present disclosure.

FIG. 4 is a flowchart of a method of verifying and applying DRM to a segment according to some examples of the present disclosure.

FIG. 5 is a flowchart of a method of an application utilizing a protected distributable file according to some examples of the present disclosure.

FIG. 6 is a schematic of a system according to some examples of the present disclosure.

FIG. 7 is a block diagram illustrating an example of a machine upon which one or more embodiments may be implemented.

DETAILED DESCRIPTION

The digital audio files created by digital audio recording devices are not generally subject to any modification or access controls other than a physical access control which resides with individuals having access to the audio files. For example, the owner of the digital audio recording device may control the distribution and use of the audio by preventing others from accessing the audio files. This control may be easily lost once the file is distributed to others as the digital recording may then be redistributed quickly over email, file transfer protocol (FTP), torrent sites, or the like. In addition to losing control over the distribution of the file, since the file is not protected, the contents of the file may be tampered with by using audio editing software to change the words spoken, make it seem as if the words were spoken by others, or to change the context of a given quote. These problems can make individuals apprehensive about being recorded, and can make the use of audio as evidence less than ideal in tribunals and other venues where standards of custody and control are desired.

Disclosed in some examples are systems, methods, devices, and machine readable mediums which may produce an audio recording with included verification from the individuals in the recording that the recording is accurate. In some examples, the system may also provide rights management control to those individuals to prevent unauthorized uses of their audio, such as unauthorized modifications. This may ensure that individuals participating in audio events that are to be recorded are assured that their words are not changed, taken out of context, or otherwise altered without permission and that they retain control over the use of their words even after the physical file has left their control.

In some examples, this may be accomplished by determining a plurality of segments of the audio recording based upon an identification of one or more active speakers that were speaking during that segment. Each audio segment may be presented to the active speaker or speakers that were identified as speaking in that segment for verification. The verification asks the speaker to affirm that the words captured represent the words spoken by that speaker. The identified active speaker may also setup rights management controls (Digital Rights Management (DRM)) to control the dissemination for each segment. Each segment may have different DRM applied.

This process ensures that the individuals identified as speaking in each segment have authenticated that what is captured on the segment is a true and accurate recording of their speech, ensures that the words cannot be altered, and ensures that distribution and other rights may be effectively controlled. By breaking the audio into particular segments based upon an active speaker, the control of the file may be distributed amongst all of the participants in the recorded audio event. This ensures that people may speak freely with the knowledge that they are in ultimate control of the use of their words.

Audio events may be any event at which audio is capable of being captured. An audio event may be a meeting, a lecture, a conference, a teleconference, an internet meeting, a concert, a performance, legal testimony, a play, or the like. The audio recording of that event may be just audio, or may be the audio track(s) of a video recording. A segment may be defined as any period of continuous speech by a speaker or group of speakers in the audio recording. For example, in a simple case, the speech segment may begin when a new speaker begins talking and end when the speaker ends talking or when another speaker begins talking. In more complex cases, multiple individuals may be talking at the same time. In these examples, several approaches to defining a segment may be employed. For example, there may be multiple time-overlapping segments. Thus if person A and person B are both talking, then a first segment would be the speech of person A and a second segment may be the speech of person B. The beginning and ending time indices of these segments may overlap. This approach may be employed where the audio capture is of a nature in which the sound processing equipment and/or software may distinguish between multiple active speakers. In other examples, a single segment may be used which may be attributable to
multiple speakers. In these examples, various rules may be employed to determine which controls may be exercised by which speakers in the segment. For example, all speakers may need to authenticate the segment, and all speakers may add DRM rights to the segment.

As already noted, the system may break the audio event into segments based upon one or more identified active speakers. Active speakers are individuals who are speaking during a particular point in the audio. In some examples, after the audio event, speakers may be recognized manually by one or more individuals who may tag the audio recording with information on which speaker is talking at particular points. In yet other examples, recognition of active speakers may be done automatically during the recording (on the fly) or automatically during post processing of the audio recording of the audio event. The system may use various speaker recognition algorithms to determine one or more active speakers. Prior to recording the meeting or other audio event, the individuals who are present may identify themselves and submit a short speech exemplar. The system may then use the exemplars to create unique voice prints for each individual. As the audio capture event progresses, or once the audio capture event has completed, various segments of the audio may be identified based upon a comparison of the recorded audio with the voice prints using one or more speaker recognition algorithms. The identified segments may be automatically tagged with the identities of the recognized speakers.

After the audio event is complete, each tagged segment may be submitted to the recognized speakers identified as speaking in that segment for verification that the tagged segment is a true and correct recording of the words spoken by those speakers. This verification may happen on the audio capture device, or it may happen on various general purpose computing devices owned or used by the identified speakers. The information sent to each recognized speaker may include an audio clip which may be the recorded audio of the segment, a transcript of the segment, or other information about the segment.

In some examples, the verification may be spoken and the verification process may compare the previously captured voice exemplar or voice print with the spoken verification to ensure that the verification is being made by the person who is speaking in the segment. In addition to, or instead of, spoken verifications, other types of biometric security may also be used to enhance this process. For example, the verification process may capture a fingerprint or other biometric property from a user prior to recording the audio event during the capture of the voice exemplars. During the verification process, this biometric property may be collected again and compared with the previously collected biometric property to ensure the verification is authentic. The biometric information may be sent as part of the segment information to a computing system of the speaker and the computing system of the speaker may verify that the biometric matches. In other examples, the biometric information may be included as part of the verification response and the capture system may verify the biometric data.

The speaker's verification information may be tagged to the distributable audio file to indicate their approval. Segments not approved or conditionally approved may be left in the file (with the lack of verification serving as an implicit signal of disapproval), or may be tagged with their conditional approval or disapproval (serving as an explicit signal of disapproval). In some examples, a user may redact portions of the audio (and any transcript created of that audio). For example, a portion that is not verified may be redacted automatically by the system. In other examples, even verified sections may be chosen to be redacted by speakers.

The speaker(s) may also tag each segment with certain DRM restrictions to control the presentation and use of each audio segment. In some examples, the distributable file created by the system may include DRM preventing modification of the audio contents of the file by default. This ensures that once the audio is verified by the constituent speakers, it cannot be altered. In some examples, the distributable file may be altered but the speaker verifications may be removed, which may signal that it has been modified. Once all speakers have verified their respective contributions and specified any desired DRM, the recording system may create a master file which may include the verification information (e.g., which segments are verified) and the DRM restrictions. Individuals who would like to make use of the master recording may do so subject to the DRM restrictions.

This process may be facilitated through the use of a recording device. The recording device may automatically perform one or more of the steps described above. For example, the recording device may perform one or more of recording the voice exemplars, identifying the segments, tagging the audio segments, sending the information for each segment to each tagged speaker for verification and DRM application, and receiving the verification and DRM restrictions. The recording device may then create a distributable file with the authentication information and the DRM restrictions. In some examples, the recording device may be a personal digital recorder, a computing device (such as a desktop computer, laptop computer, tablet computer, smartphone), or the like. In some examples, the recording device may perform some of the aforementioned functionality and one or more other computing devices may perform the rest. In yet other examples, some or all of the aforementioned functionality may be performed by a cloud based service, such as those associated with a conference call service in which multiple users call in to have a teleconference. Example conference call services may include GoToMeeting® from Citrix Online, LLC, Uberconference® by Firespotter, Inc., and the like. These services may offer a phone bridge between multiple users and may include a feature to record conference calls. These services may incorporate the features of the present disclosure in their computing systems to provide for verification and DRM of the meeting. While some of the operations may be performed by one or more computing or recording devices, other portions of the operations may be performed by other computing devices in a distributed fashion.

Turning now to FIG. 1, an example high level schematic of a system 1000 according to some examples of the present disclosure is shown. Capture device 1010 may capture the audio of the audio event. Among other components, capture device 1010 may have a microphone and a processor which may perform one or more of capturing the voice exemplars, extracting an audio print from the voice exemplars, recording the audio event, identifying the audio segments, identifying active speakers in each segment, and creating the distributable audio file. In some examples, capture device 1010 may also handle verifications and DRM selection. In other examples, capture device 1010 may send information about each segment to one or more verification devices 1020 for verification by the identified speakers and DRM addition. For example, verification devices 1020 may be computing devices owned by one of the identified speakers and which may be addressable by electronic contact information given the system by the identified speaker. In yet other examples,
capture device 1010 may handle some verifications and DRM selections and verification devices 1020 may handle others. The capture device 1010 may receive the responses from the verification devices 1020 and may create the final distributable audio file. In other examples, a back end processing device 1040 may perform one or more of the steps performed by the capture device 1010 or verification devices 1020, such as identification of segments and/or distribution to individuals. For example, the capture device 1010 may record the voice exemplars and the audio event and send the audio file to the back end processing device 1040 for processing (e.g., identifying the voices, creating the segments, handling verification and DRM, and creating the distributable audio file). A playback device 1030 may play back the distributable audio file subject to any DRM restrictions on the distributable file. For example, the audio file may be in a proprietary format and/or encrypted as a result of the DRM applied to it. This format and encryption may be capable of being played by only certain applications that are trusted to enforce the appropriate DRM restrictions. In some examples, one or more of the verification device 1020, playback device 1030, back end processing device 1040, and capture device 1010 may be the same device.

Turning now to FIG. 2, a flowchart of a method 2000 according to some examples is shown. At operation 2010, prior to the start of the audio event, those participants who are present, or who anticipate speaking, may submit a voice exemplar. The voice exemplar may be a word, phrase, sentence, or passage which may be predetermined and which may be selected so as to record certain distinguishing sounds. These voice exemplars may be utilized to extract a number of voice related features called a voice print. The voice print may then be used to identify active speakers during the audio recording. A voice print comprises any information that can be used to distinguish a person’s speech from that of another person. For example, it may comprise one or more distinctive patterns of speech characteristics. Example characteristics include frequency or pitch, speed, word pronunciation, dialect, or the like. For example, the individual may input an identifier (e.g., their name) and read a verbal passage or phrase. In some examples, the individuals may also give electronic contact information (e.g., an email address, an Internet Protocol (IP) address, or the like), which may be used by the system to automatically send segments for verification and DRM selection to recognized speakers. The identifier and contact information may be provided orally (e.g., recorded by the system and then, through speech recognition algorithms, translated into computer readable data) or through an input mechanism such as a keyboard. This process may continue until all the individuals who are to speak have provided exemplars.

While in some examples, the exemplars are given before the audio event, in other examples, the system may have a setup process in which users may pre-record their voice exemplars (e.g., voice exemplars). The system may then store a library of voice exemplars and use the library to determine active speakers. In other examples, prior to the audio event, the meeting participants may supply the system their credentials (created when they completed the setup process) and the system may speed up processing by prefetching the voice exemplars from a database (e.g., onboard storage, remote storage accessible by a network, or the like).

At operation 2020, the audio event recording begins. At operation 2030, the recording system recognizes and tags audio segments with information on the identities of active speakers. This operation may be done after the audio event is complete or as the audio is being recorded.

FIG. 3 shows a flowchart of a method 3000 of recognizing audio segments according to some examples of the present disclosure. As the audio is being recorded or being processed (if segmentation happens after the audio event is concluded), an active speaker may be determined based upon a comparison between the voice currently speaking at the particular examined time index N and the voice prints created from the speech exemplars at operation 3010. Various speaker recognition algorithms may be used such as frequency estimation, hidden Markov models, Gaussian mixture models, pattern matching algorithms, neural networks, matrix representation, vector quantization, decision trees, or other algorithms. Once an identity of the speaker has been determined, at operation 3020 the determined active speaker is compared to a last known speaker to determine if a change in speaker has occurred. If the active speaker and the last known active speaker are different, a new segment is defined at operation 3030 and the new segment is tagged with the active speaker identified in operation 3010. If the active speaker is the same as the last known active speaker, the current segment is continued. In some examples, this process may be repeated periodically at a particular sampling frequency P so as to capture change in speaker events (in order to generate new segments). In other examples, the method of FIG. 3 may be triggered by continuously monitoring for aural clues that the speaker has changed (e.g., monitoring for changes in pitch, volume, frequency, or the like).

The method 3000 of FIG. 3 may also be employed in situations in which multiple speakers may speak at the same time, or nearly the same time. In those scenarios, the active speaker determined in operation 3010 may be multiple active speakers, and the comparison at operation 3020 may be a comparison to determine if a different group of active speakers are speaking. For example, if at time index N, Bill and Jill are speaking, and then at time index N+P Bill, Jill, and Chris are speaking, then because the group of active speakers of Bill and Jill is different than the active speaker group of Bill, Jill, and Chris, a new segment may be created.

In some examples, segments may be a minimum length. This may be created by setting P to a minimum value (e.g., 3 seconds). In some examples, the system may sample the segments every P seconds, but upon finding a change in segment, may adjust the segment to capture the exact point at which the active speaker (or group of active speakers) changes. For example, the system may "rewind" the audio to determine the exact moment where the active speaker or group of active speakers changed. This may prevent the segment from starting in the middle of someone's speech.

Turning back to FIG. 2, once the audio segments are recognized and tagged, information for each of the audio segments may be sent to each identified speaker or group of speakers at operation 2040. The information for the segments may provide information to the identified speaker(s) to assist them in verifying the segment. Example information on the segments includes one or more of: all of, or portions of, the audio of the segment or the recording as a whole; a transcript of the audio of that segment or the recording as a whole that is generated automatically based upon speech recognition algorithms; information on identified speakers; meta data regarding the segment or audio as a whole such as segment length, segment position in the audio event; or any other information about the segment or the audio as a whole. In some examples, in order to provide additional context to speakers
when verifying segments, the system may provide a certain amount of segment information for segments just before and after the segment of interest.

The identified speakers may then decide whether or not the segment is to be verified or not, and whether or not to include DRM restrictions. The identified speakers may utilize the information provided to them by the system. The system may then receive their approval, conditional approval, or denial and the choice of DRM for the segment. An approval indicates that the segment contains an accurate portrayal of the individual’s speech during the segment. A conditional approval is one in which some parts of the segment are accurate and other parts are not. A conditional approval may specify which parts of a segment are approved and which are not. A denial is a condition in which the segment is not verified. The segment may then be tagged with this indication. The segment may also be tagged by the DRM chosen by the individual. If multiple individuals are identified as active speakers in a segment, each speaker's verification, conditional verification, or denial is added to the segment. If multiple individuals submit DRM, each DRM decision is also added to the segment.

Example DRM restrictions include restrictions on copying, accessing, modifying, distributing, transcribing (e.g., restrictions on any digital copy of the text translation of the audio) or deleting the segment. In some examples, the DRM may prohibit anyone from performing these activities, but in other examples, the DRM may prohibit or allow only certain users (or groups of users) from performing these acts. In yet other examples, the DRM may prohibit certain users (or groups of users) from performing these acts unless permission is obtained from the identified speaker placing the DRM restriction on the segment. In examples in which multiple individuals place DRM on the same segment, any usage of the segment, such as playing back the segment, may require that the user satisfy all of the DRM restrictions placed on the segment by all of the identified speakers. In some examples, the system may only play back tracks (e.g., voices) associated with DRM policies that are satisfied. For example, if three people are talking in a segment and the DRM policy is only satisfied for two of the speakers, then only those two are played back (the other person is muted or

verification and DRM process. In some examples, this segment information may be received at a separate computing device from the device used to record the audio event. For example, the segment information may be received from a capture device 1010 or a back end processing device 1040 from FIG. 1. In other examples, the capture device may also perform the verification and DRM tagging. In these examples, the segment information may be received from a separate module of the capture device.

As previously explained, the segment information may include audio of the segment, a transcript of the segment, meta data about the segment (e.g., size in bytes, length, position in the audio event, time recorded, date, or the like), information on identified active speakers, or the like. At operation 4020, segment information may be presented to an active speaker. For example, the audio file may be played, the transcript displayed, the meta data presented, and the like. Additionally, options for verification and for application of DRM may be shown.

The user may then determine whether or not to verify the segment and what, if any, DRM to apply to that segment. The verification and DRM process may receive the decision of the user and the DRM selections at operation 4030. Once the decisions have been made, the verification information and DRM may be sent back to the source of the segment information (e.g., the capture device, the back end server, or another process or module) at operation 4040.

FIG. 5 shows a flowchart of a method 5000 of an application utilizing (e.g., playing, editing) a protected distributable file according to some examples of the present disclosure. At operation 5010, the audio file of interest may be selected and the choice may be received by the application. At operation 5020, a desired action may be chosen and the selection may be received by the application. For example, the user of the application may desire to play the audio file. In other examples, other actions may include modifying the file, modifying the audio, viewing the verification information, viewing the segment information, or the like. For each segment in the audio file to which the action relates, the application determines whether the DRM conditions associated with that segment are met based upon the action selected, the permissions of the user of the
bleeped out). application, and the DRM tagged to the segment at operation
At operation 2050, the system receives the verification 5030. If at 5040 the DRM conditions are satisfied, the action
and the DRM restrictions for the segments. At operation 45 is performed. For example, if the user has permission to play
2060, the system checks to determine whether all the seg the audio, and the action is to play the audio, then the audio
ments are verified. If not all the segments are accounted for, of the segment is played. If the DRM conditions were not
the system may send a reminder to identified speakers who satisfied, then the action is not performed at operation 5050.
have not submitted all of the segments. If a predetermined Operations 5030-5050 may be repeated for each segment
time period passes and all speakers of all segments have not 50 that is the subject of the action selection at operation 5020.
been accounted for, the system may take appropriate action. For example, if the user wants to modify two segments of the
For example, the system may not attach any verification audio recording, the operations of 5030-5050 would be
information to that segment and may attach a default DRM repeated for each segment. Thus a user may have permis
for unacknowledged segments. In other examples, the sys sions to play or modify only certain segments, but not others.
tem may not necessarily need to account for verification to 55 In other examples, the application may only allow the action
allow dissemination of parts that have been approved. For if the DRM conditions are satisfied for the entire audio file.
example, parts not approved may be redacted until they are Thus if the user has permission to listen to only some of the
approved. segments, but not all the segments, none of the segments
At operation 2070, once the segments are all accounted may be played. In other examples, if less than all of the
for (or the time has elapsed on unaccounted for segments), 60 DRM conditions are satisfied the action may be partially
a master recording may be created which may include the performed. For example, if the DRM conditions for two of
various speaker tags, verification tags, and DRM restric the three speakers in the segment have been satisfied
tions. FIG. 6 shows a more detailed schematic of an example
Turning now to FIG. 4, a flowchart of a method 4000 of system 6000 according to some examples of the present
verifying and applying DRM to a segment according to 65 disclosure. The capture device 6010 (e.g., capture device
Some examples of the present disclosure is shown. At 1010 from FIG. 1) may include an audio capture module
operation 4010, segment information is received at the 6020 which may capture voice exemplars as well as record
the audio event. In some examples audio capture module 6020 may record the identifications and contact information of all the speakers as well. The output of the capture device 6010 may be stored in storage 6060. Storage 6060 may be any local or remote storage such as flash memory, random access memory (RAM), a hard drive, a solid state drive (SSD), optical, magnetic, tape, or other storage device. In some examples, storage 6060 may be on a separate device and the audio information may be sent by the input and output module 6050 to the remote storage.

Capture device 6010 may also include a control module 6090 which may control the process including: controlling audio capture; determining segments based upon analysis of the audio done by the voice recognition module 6040; providing a user interface through input and output module (which may control one or more displays and input devices); creating the final distributable audio file; in some examples, coordinating any review for verification and DRM application on or off the device 6010 by utilizing review module 6070, playback module 6080, and DRM module 6030 or input and output module 6050; and the like.

Playback module 6080 may play back audio stored on storage 6060. In some examples this may be for verification and for adding DRM by identified speakers. In other examples, the device may play the audio file for a user of the device. In these examples, the device utilizes DRM module 6030 to unlock the audio file or portions of the audio file for playback. DRM module 6030 may set access rights (in the case of the verification and adding DRM on the device), verify access rights, and in some examples, depending on the audio format, may unprotect the audio in memory in order to allow playback module to utilize the audio for playback if the device meets the access restrictions on the audio file.

Input and output module 6050 may communicate with one or more other computing devices over network 6110 and may provide one or more user interfaces on device 6010 at the direction of control module 6090. Input and output module 6050 may send the distributable audio file, information on the segments to the identified speakers for verification and DRM tagging, may receive the verification results including the DRM tags, receive user input and the like.

Voice recognition module 6040 may analyze the voice exemplars to generate the voice prints and may determine an active speaker or speakers at a given point in the audio based on the analyzed voice prints. Review module 6070 may coordinate with the control module 6090, playback module 6080, DRM module 6030, input and output module 6050, and storage 6060 to display, play, or otherwise present the segment information to one or more identified speakers and may accept input regarding a verification status of the segment with respect to the identified speaker(s). Review module 6070 and/or control module 6090 may then tag the segment with the verification status and the DRM information.

Verification computing device 6100 may communicate with capture device 6010 over network 6110 through input and output module 6115 to receive information on segments for verification and DRM selection. Input and output module 6115 may also present one or more user interfaces and accept user input from a user of verification computing device 6100. Input and output module 6115 may receive the segment information for verification from capture device 6010. The review module 6130 may present the segment information (e.g., the audio or transcript of the audio) to the user through the playback module 6140 and/or the input and output module 6115. The input and output module 6115 may then get the user's verification status (verified, not verified, partially verified) and any DRM that the user wishes to apply to the segment. The review module 6130 may then send this information via the input and output module 6115 through the network 6110.

In addition, playback module 6140 may playback one or more segments of the audio file if the DRM conditions of the audio file are satisfied. The playback module 6140 may utilize DRM module 6120 to decode the audio and determine DRM compliance. In some examples, the modules of verification computing device 6100 may perform the same or similar functions as their counterparts on capture device 6010.

End use computing device 6150 may utilize the distributable audio file. For example the end use computing device 6150 may play the audio file, edit the audio file, redistribute the audio file, and the like. Input and output module 6160 may communicate with verification computing device 6100 and/or capture device 6010 over network 6110. For example, end use computing device 6150 may receive the distributable audio file from the capture device 6010. Playback module 6180 may play the audio, edit the audio file, redistribute the audio file and the like, subject to DRM restrictions. Playback module 6180 may utilize DRM module 6170 which may ensure that the end use computing device 6150 (and in some examples, the user of the end use computing device 6150) has appropriate permission to utilize the audio file in the desired manner.

Network 6110 may be or include portions of one or more of a Local Area Network (LAN), a Wide Area Network (WAN), the Internet, a cellular network (such as a 3G wireless network or a 4G wireless network), or the like.

The logical organization of functionality shown in FIG. 6 may be re-arranged without departing from the scope of the present disclosure. Thus the functionality of one or more of the modules of capture device 6010, verification computing device 6100 and end use computing device 6150 may be implemented on any of capture device 6010, verification computing device 6100, or end use computing device 6150. Additionally, one or more of capture device 6010, verification computing device 6100, and end use computing device 6150 may be combined into one or more physical devices, or split among several devices.

Example use cases may include police interrogations, depositions, interviews, corporate meetings, life blogging, recording conference calls, arbitration, mediation, courtroom recordings (e.g., as an alternative to expensive court transcriptions), legal statement taking and testimony, or the like. In some examples in which portions of the disclosure are performed outside the capture device, those portions may be performed in a trusted execution space in order to create higher confidence in the security provided. In some examples, the DRM applied may be compatible and readily consumable by standard DRM products. Examples include DRM supplied by Apple, Inc., such as FairPlay, Marlin DRM developed and maintained by the Marlin Developer Community, Adept DRM developed by Adobe, and DRM developed by Amazon.com. In yet other examples, a proprietary DRM may be utilized. With some DRM systems, additional servers may be utilized to verify entitlements, provide decryption keys, and the like. Thus some or all of the functionalities provided by the DRM modules of FIG. 6 may be on a separate server.

Certain embodiments are described herein as including logic or a number of components, modules, or mechanisms. Modules may constitute either software modules (e.g., code
embodied on a machine-readable medium or in a transmission signal) or hardware modules. A hardware module is a tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. In example embodiments, one or more computing devices (e.g., a standalone, client or server computing device) or one or more hardware modules of a computing device (e.g., a processor or a group of processors) may be configured by software (e.g., an application or application portion) as a hardware module that operates to perform certain operations as described herein.

In various embodiments, a hardware module may be implemented mechanically or electronically. For example, a hardware module may comprise dedicated circuitry or logic that is permanently configured (e.g., as a special-purpose processor, such as a field programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware module may also comprise programmable logic or circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software to perform certain operations. It will be appreciated that the decision to implement a hardware module mechanically, in dedicated and permanently configured circuitry, or in temporarily configured circuitry (e.g., configured by software) may be driven by cost and time considerations.

Accordingly, the term "hardware module" should be understood to encompass a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired) or temporarily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein. Considering embodiments in which hardware modules are temporarily configured (e.g., programmed), each of the hardware modules need not be configured or instantiated at any one instance in time. For example, where the hardware modules comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware modules at different times. Software may accordingly configure a processor, for example, to constitute a particular hardware module at one instance of time and to constitute a different hardware module at a different instance of time.

Hardware modules may provide information to, and receive information from, other hardware modules. Accordingly, the described hardware modules may be regarded as being communicatively coupled. Where multiple of such hardware modules exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware modules. In embodiments in which multiple hardware modules are configured or instantiated at different times, communications between such hardware modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware modules have access. For example, one hardware module may perform an operation and store the output of that operation in a memory device to which it is communicatively coupled. A further hardware module may then, at a later time, access the memory device to retrieve and process the stored output. Hardware modules may also initiate communications with input or output devices, and may operate on a resource (e.g., a collection of information).

The various operations of example methods described herein may be performed, at least partially, by one or more processors that are temporarily configured (e.g., by software) or permanently configured to perform the relevant operations. Whether temporarily or permanently configured, such processors may constitute processor-implemented modules that operate to perform one or more operations or functions. The modules referred to herein may, in some example embodiments, comprise processor-implemented modules.

Similarly, the methods described herein may be at least partially processor-implemented. For example, at least some of the operations of a method may be performed by one or more processors or processor-implemented modules. The performance of certain of the operations may be distributed among the one or more processors, not only residing within a single machine, but deployed across a number of machines. In some example embodiments, the processor or processors may be located in a single location (e.g., within a home environment, an office environment or as a server farm), while in other embodiments the processors may be distributed across a number of locations.

The one or more processors may also operate to support performance of the relevant operations in a "cloud computing" environment or as a "software as a service" (SaaS). For example, at least some of the operations may be performed by a group of computers (as examples of machines including processors), with these operations being accessible via a network (e.g., the Internet) and via one or more appropriate interfaces (e.g., APIs).

Example embodiments may be implemented in digital electronic circuitry, or in computer hardware, firmware, software embodied in computer-readable medium, or in combinations of them. Example embodiments may be implemented using a computer program product, for example, a computer program tangibly embodied in an information carrier, for example, in a machine-readable medium for execution by, or to control the operation of data processing apparatus, for example, a programmable processor, a computer, or multiple computers.

A computer program may be written in any form of programming language, including compiled or interpreted languages, and it may be deployed in any form, including as a stand-alone program or as a module, subroutine, or other unit suitable for use in a computing environment. A computer program may be deployed to be executed on one computer or on multiple computers at one site or distributed across multiple sites and interconnected by a communication network.

In example embodiments, operations may be performed by one or more programmable processors executing a computer program to perform functions by operating on input data and generating output. Method operations may also be performed by, and apparatus of example embodiments may be implemented as, special purpose logic circuitry (e.g., a FPGA or an ASIC).

The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In embodiments deploying a programmable computing system, it will be appreciated that both hardware and software architectures require consideration. Specifically, it will be appreciated that the choice of whether to implement certain functionality in permanently configured hardware (e.g., an ASIC), in temporarily configured hardware (e.g., a combination of software and a programmable processor), or a combination of permanently and temporarily configured hardware may be a
design choice. Below are set out hardware (e.g., machine) and software architectures that may be deployed, in various example embodiments.

FIG. 7 is a block diagram of machine in the example form of a computing device 7000 within which instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed. For example, any one of the components shown in FIGS. 1 and 6 may be or contain one or more of the components described in FIG. 7. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in server-client network environment, or as a peer machine in a peer-to-peer (or distributed) network environment. The machine may be a personal computer (PC), a notebook PC, a docking station, a wireless access point, a tablet PC, a set-top box (STB), a PDA, a cellular telephone, a smartphone, a web appliance, a network router, switch or bridge, or any machine capable of executing instructions (sequential or otherwise) that specify actions to be taken by that machine. Further, while only a single machine is illustrated, the term "machine" shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein. The machine may contain components not shown in FIG. 7 or only a subset of the components shown in FIG. 7.

The example computing device 7000 includes a processor 7002 (e.g., a central processing unit (CPU) (e.g., a computer processor), a graphics processing unit (GPU) or both), a main memory 7004 and a static memory 7006, which communicate with each other via an interconnect 7008, such as a bus. The computing device 7000 may further include a video display unit 7010 (e.g., a liquid crystal display (LCD) or a cathode ray tube (CRT)). The computing device 7000 may also include an alphanumeric input device 7012 (e.g., a keyboard), a user interface (UI) navigation device 7014 (e.g., a mouse), a disk drive unit 7016, a signal generation device 7018 (e.g., a speaker) and a network interface device 7020. In some examples, the device may be or contain a System on a Chip (SoC) comprising one or more of the components of FIG. 7.

The disk drive unit 7016 includes a machine-readable medium 7022 on which is stored one or more sets of instructions and data structures (e.g., software) 7024 embodying or used by any one or more of the methodologies or functions described herein. The instructions 7024 may also reside, completely or at least partially, within the main memory 7004, static memory 7006, and/or within the processor 7002 during execution thereof by the computing device 7000, the main memory 7004 and the processor 7002 also constituting machine-readable media.

While the machine-readable medium 7022 is shown in an example embodiment to be a single medium, the term "machine-readable medium" may include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more instructions or data structures. The term "machine-readable medium" shall also be taken to include any tangible medium that is capable of storing, encoding or carrying instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the present invention, or that is capable of storing, encoding or carrying data structures used by or associated with such instructions. The term "machine-readable medium" shall accordingly be taken to include, but not be limited to, solid-state memories, and optical and magnetic media. Specific examples of machine-readable media include non-volatile memory, including by way of example, semiconductor memory devices (e.g., Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM)) and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.

The instructions 7024 may further be transmitted or received over a communications network 7026 using a transmission medium. The instructions 7024 may be transmitted using the network interface device 7020 and any one of a number of well-known transfer protocols (e.g., HTTP). Examples of communication networks include a LAN, a WAN, the Internet, mobile telephone networks, Plain Old Telephone (POTS) networks, and wireless data networks (e.g., Wi-Fi® and WiMAX® networks). The term "transmission medium" shall be taken to include any intangible medium that is capable of storing, encoding or carrying instructions for execution by the machine, and includes digital or analog communications signals or other intangible media to facilitate communication of such software. Network interface 7020 may wirelessly transmit data and may include an antenna.

Although the present invention has been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the invention. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense.

Although an embodiment has been described with reference to specific example embodiments, it will be evident that various modifications and changes may be made to these embodiments without departing from the broader spirit and scope of the invention. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense. The accompanying drawings that form a part hereof, show by way of illustration, and not of limitation, specific embodiments in which the subject matter may be practiced. The embodiments illustrated are described in sufficient detail to enable those skilled in the art to practice the teachings disclosed herein. Other embodiments may be used and derived therefrom, such that structural and logical substitutions and changes may be made without departing from the scope of this disclosure. This Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various embodiments is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.

Thus, although specific embodiments have been illustrated and described herein, it should be appreciated that any arrangement calculated to achieve the same purpose may be substituted for the specific embodiments shown. This disclosure is intended to cover any and all adaptations or variations of various embodiments. Combinations of the above embodiments, and other embodiments not specifically described herein, will be apparent to those of skill in the art upon reviewing the above description.

In addition, in the foregoing Detailed Description, it may be seen that various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments require more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive subject matter lies in
less than all features of a single disclosed embodiment. Thus the following claims are hereby incorporated into the Detailed Description, with each claim standing on its own as a separate embodiment.

Other Notes and Examples:

Example 1 includes subject matter (such as a method, means for performing acts, machine readable medium including instructions which, when performed by a machine cause the machine to perform acts, or an apparatus configured to perform) receiving a voice exemplar from each of a plurality of individuals; recording an audio event; determining a plurality of audio event segments of the audio event, the audio event segments determined based upon changes in at least one identified active speaker, each segment having at least one corresponding identified active speaker, the identification based upon the received voice exemplars; receiving verification information for at least one segment from the corresponding identified active speaker for the at least one segment; and responsive to receiving verification information for at least one segment, producing a master audio file including the tagged segments and verification information.

In example 2, the subject matter of example 1 may optionally include, sending a verification request for the at least one segment to the corresponding at least one identified active speaker for that segment.

In example 3 the subject matter of any one or more of examples 1-2 may optionally include, wherein the verification request includes an audio clip of the segment.

In example 4 the subject matter of any one or more of examples 1-3 may optionally include, comprising automatically generating a transcript of each segment and wherein the verification request includes the transcript of the segment.

In example 5 the subject matter of any one or more of examples 1-4 may optionally include wherein the verification request includes biometric data.

In example 6 the subject matter of any one or more of examples 1-5 may optionally include, wherein the biometric data includes a voice print of a recipient active speaker of the segment.

In example 7 the subject matter of any one or more of examples 1-6 may optionally include, comprising receiving digital rights management information for a respective segment from the at least one corresponding identified active speaker for the respective segment, and wherein producing a master audio file comprises including the digital rights management information in the master audio file.

In example 8 the subject matter of any one or more of examples 1-7 may optionally include, wherein the digital rights management information defines a set of access permissions for a user group.

In example 9 the subject matter of any one or more of examples 1-8 may optionally include, wherein the set of access permissions includes at least two of read access, write access, and distribute access.

In example 10 the subject matter of any one or more of examples 1-9 may optionally include, wherein the method is performed by a recording device.

In example 11 the subject matter of any one or more of examples 1-10 may optionally include, wherein the method is performed at least partially by a recording device and at least partially by a computing device.

In example 12 the subject matter of any one or more of examples 1-11 may optionally include wherein the verifi

sponding at least one identified active speaker corresponding for the at least one segment created the audio in the respective segment.

In example 13 the subject matter of any one or more of examples 1-12 may optionally include comprising providing a conference bridge for a conference call.

Example 14 includes or may optionally be combined with the subject matter of any one of examples 1-13 to include subject matter (such as a device, apparatus, or machine) comprising an audio capture module configured to: receive a voice exemplar from each of a plurality of individuals; record an audio event; a control module configured to: determine a plurality of audio event segments of the audio event, the audio event segments determined based upon changes in at least one identified active speaker, each segment having at least one corresponding identified active speaker, the identification based upon the received voice exemplars; an input and output module configured to: receive verification information for at least one segment from the corresponding at least one identified active speaker of the at least one segment; and wherein the control module is configured to produce a master audio file including the tagged segments and verification information responsive to the input and output module receiving verification information for the at least one segment.

In example 15, the subject matter of any one or more of examples 1-14 wherein the audio file is a digital audio file.

In example 16 the subject matter of any one or more of examples 1-15 may optionally include wherein the input and output module is configured to send a verification request for the at least one segment to the at least one corresponding identified active speaker for that segment.

In example 17 the subject matter of any one or more of examples 1-16 may optionally include wherein the verification request includes an audio clip of the segment.

In example 18 the subject matter of any one or more of examples 1-17 may optionally include wherein the control module is configured to automatically generate a transcript of each segment and wherein the verification request includes the transcript of the segment.

In example 19 the subject matter of any one or more of examples 1-18 may optionally include wherein the verification request includes biometric data.

In example 20 the subject matter of any one or more of examples 1-19 may optionally include wherein the biometric data includes a voice print of a recipient active speaker of the segment.

In example 21 the subject matter of any one or more of examples 1-20 may optionally include wherein the input and output module is configured to receive digital rights management information for each segment from the corresponding at least one identified active speaker for that segment, and wherein the control module is configured to produce a master audio file by at least including the digital rights management information in the master audio file.

In example 22 the subject matter of any one or more of examples 1-21 may optionally include wherein the digital rights management information defines a set of access permissions for a user group.

In example 23 the subject matter of any one or more of examples 1-22 may optionally include wherein the set of access permissions includes at least two of read access, write access, and distribute access.

In example 24 the subject matter of any one or more of examples 1-23 may optionally include wherein the audio capture module, the control module and the input and output
cation information comprises a verification that the corre module is on a recording device.
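Examples 21 to 23 describe digital rights management information that defines a set of access permissions (read, write, distribute) for a user group. The following is a minimal sketch of one way such a per-segment record could be represented and queried. The names `Access`, `segment_drm`, and `is_allowed`, and the group names, are hypothetical and chosen only for illustration; the examples do not specify any data format.

```python
from enum import Flag, auto


class Access(Flag):
    """The three access rights named in the examples."""
    READ = auto()
    WRITE = auto()
    DISTRIBUTE = auto()


NO_ACCESS = Access(0)

# Hypothetical DRM record for one segment: user group -> granted rights.
# Each group here is given two of the three rights.
segment_drm = {
    "project-team": Access.READ | Access.DISTRIBUTE,
    "archivists": Access.READ | Access.WRITE,
}


def is_allowed(drm, group, requested):
    """Return True if every requested right is granted to the group."""
    granted = drm.get(group, NO_ACCESS)
    return (granted & requested) == requested
```

Under these assumptions, `is_allowed(segment_drm, "project-team", Access.WRITE)` evaluates to `False`, and a group with no entry in the record is granted nothing.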
In example 25 the subject matter of any one or more of examples 1-24 may optionally include wherein at least one of the audio capture module, the control module and the input and output module is on a recording device and at least one of the audio capture module, the control module and the input and output module is on a separate computing device.
In example 26 the subject matter of any one or more of examples 1-25 may optionally include wherein the verification information comprises a verification that the corresponding at least one identified active speaker for that segment created the audio in the segment.
In example 27 the subject matter of any one or more of examples 1-26 may optionally include wherein the control module is configured to provide a conference bridge for a conference call.
In example 28, the subject matter of any one or more of examples 1-27 may optionally include (a playback module configured to, instructions which when executed cause the processor to perform the operations of, or method steps comprising): receiving a command to play back the master audio file from a user, determining for a particular segment in the master audio file that the user has not satisfied a DRM condition applied to that particular segment; responsive to determining that the user has not satisfied the DRM condition applied to that particular segment, refraining from playing the audio of that segment.
In example 29, the subject matter of any one or more of examples 1-28 may optionally include (a playback module configured to, instructions which when executed cause the processor to perform the operations of, or method steps comprising): receiving a command to play back the master audio file from a user; determining for a particular segment in the master audio file that the user has satisfied a DRM condition applied to that particular segment for a first identified active speaker, but not a second DRM condition applied to that particular segment for a second identified active speaker, responsive to determining that the user has satisfied a DRM condition applied to that particular segment for a first identified active speaker, but not a second DRM condition applied to that particular segment for a second identified active speaker, playing portions of the segment in which the first identified active speaker is speaking and refraining from playing the portions of the segment in which the second identified active speaker is speaking.
In example 30, the subject matter of any one or more of examples 1-29 may optionally include (a playback module configured to, instructions which when executed cause the processor to perform the operations of, or method steps comprising): receiving a command to play back the master audio file from a user; determining for whether the user has satisfied each of a plurality of DRM conditions applied to respective ones of the plurality of audio event segments; and playing the audio only if the user has satisfied each of a plurality of DRM conditions applied to respective ones of the plurality of audio event segments.
In example 31, the subject matter of any one or more of examples 1-30 may optionally include (a playback module configured to, instructions which when executed cause the processor to perform the operations of, or method steps comprising): receiving a command to display a transcript of the master audio file from a user, determining for a particular segment in the master audio file that the user has not satisfied a DRM condition applied to that particular segment; responsive to determining that the user has not satisfied the DRM condition applied to that particular segment, refraining from displaying the transcript of that segment.
In example 32, the subject matter of any one or more of examples 1-31 may optionally include (a playback module configured to, instructions which when executed cause the processor to perform the operations of, or method steps comprising): receiving a command to display a transcript of the master audio file from a user, determining for a particular segment in the master audio file that the user has satisfied a DRM condition applied to that particular segment for a first identified active speaker, but not a second DRM condition applied to that particular segment for a second identified active speaker, responsive to determining that the user has satisfied a DRM condition applied to that particular segment for a first identified active speaker, but not a second DRM condition applied to that particular segment for a second identified active speaker, displaying the transcript for portions of the segment in which the first identified active speaker is speaking and refraining from displaying the transcript for portions of the segment in which the second identified active speaker is speaking.
In example 33, the subject matter of any one or more of examples 1-32 may optionally include (a playback module configured to, instructions which when executed cause the processor to perform the operations of, or method steps comprising): receiving a command to display a transcript of the master audio file from a user; determining for whether the user has satisfied each of a plurality of DRM conditions applied to respective ones of the plurality of audio event segments; and displaying the transcript only if the user has satisfied each of a plurality of DRM conditions applied to respective ones of the plurality of audio event segments.
What is claimed is:
1. A method of recording audio comprising:
using one or more processors to perform operations of:
receiving a voice exemplar from a plurality of participants in an audio event;
recording audio of the audio event;
determining a first audio event segment and a second audio event segment of the recorded audio, the first and second audio event segments determined based upon identifying changes in at least one identified active speaker, the first audio event segment having a first set of one or more active speakers and the second audio event segment having a second set of one or more active speakers, the changes in at least one identified active speaker identified based upon matching the received voice exemplars with audio in the first and second audio event segments;
receiving verification information for the first audio event segment from at least one person from the first set of active speakers for the first audio event segment, the verification information indicating an opinion of the at least one person as to whether the first audio event segment is an accurate reproduction of the audio event;
receiving first digital rights management (DRM) information for the first audio event segment from the at least one person from the first set of active speakers;
receiving second DRM information for the second audio event segment, the first DRM information specifying a usage restriction on the first audio event segment and the second DRM information specifying a usage restriction on the second audio event segment, wherein the first and second DRM information specify different usage restrictions; and
responsive to receiving verification information, first DRM information, and second DRM information, producing a master audio file including the first and second audio event segments, and the verification information,
wherein the master audio file implements a first DRM restriction corresponding to the first DRM information for the first audio event segment and a second DRM restriction corresponding to the second DRM information for the second audio event segment.
2. The method of claim 1, comprising sending a verification request for the first audio event segment to the at least one person from the first set of active speakers.
3. The method of claim 2, wherein the verification request includes an audio clip of the first audio event segment.
4. The method of claim 2, comprising automatically generating a transcript of the first audio event segment and wherein the verification request includes the transcript of the first audio event segment.
5. The method of claim 1, wherein the digital rights management information defines a set of access permissions for a user group.
6. The method of claim 5, wherein the set of access permissions for the first audio event segment includes at least two of read access, write access, and distribute access.
7. The method of claim 1, wherein the method is performed by a recording device.
8. The method of claim 1, comprising:
receiving a command to play back the master audio file from a user;
determining that the user has not satisfied the usage restriction applied to the first audio event segment;
responsive to determining that the user has not satisfied the first DRM restriction applied to the first audio event segment, refraining from playing the audio of the first audio event segment; and
responsive to determining that the user has satisfied the second DRM restriction applied to the second audio event segment, playing the audio of the second audio event segment.
9. A machine readable medium that stores instructions, which when performed by a machine, cause the machine to perform operations comprising:
receiving a voice exemplar from each of a plurality of participants in an audio event;
recording audio of the audio event;
determining a first audio event segment and a second audio event segment of the recorded audio, the first and second audio event segments determined based upon identifying changes in at least one identified active speaker, the first audio event segment having a first set of one or more active speakers and the second audio event segment having a second set of one or more active speakers, the changes in at least one identified active speaker identified based upon matching the received voice exemplars with audio in the first and second audio event segments;
receiving verification information for the first audio event segment from at least one person from the first set of active speakers, for the first audio event segment, the verification information indicating an opinion of the at least one person as to whether the first audio event segment is an accurate reproduction of the audio event;
receiving first digital rights management (DRM) information for the first audio event segment from the at least one person from the first set of active speakers;
receiving second DRM information for the second audio event segment, the first DRM information specifying a usage restriction on the first audio event segment and the second DRM information specifying a usage restriction on the second audio event segment, wherein the first and second DRM information specify different usage restrictions; and
responsive to receiving verification information, first DRM information, and second DRM information, producing a master audio file including the first and second audio event segments, and the verification information, wherein the master audio file implements a first DRM restriction corresponding to the first DRM information for the first audio event segment and a second DRM restriction corresponding to the second DRM information for the second audio event segment.
10. The machine readable medium of claim 9, wherein the instructions further include instructions, which when performed by the machine, cause the machine to perform the operations of sending a verification request for the first audio event segment to the at least one person from the first set of active speakers.
11. The machine readable medium of claim 10, wherein the verification request includes an audio clip of the first audio event segment.
12. The machine readable medium of claim 10, wherein the instructions further include instructions, which when performed by the machine, cause the machine to perform the operations of automatically generating a transcript of the first audio event segment and wherein the verification request includes the transcript of the first audio event segment.
13. The machine readable medium of claim 10, wherein the verification request includes biometric data.
14. A system for recording audio comprising:
a computer processor,
a memory, communicatively coupled to the computer processor, and comprising instructions, which when performed by the computer processor, causes the system to perform operations to:
receive a voice exemplar from a plurality of participants in an audio event;
record audio of the audio event;
determine a first audio event segment and a second audio event segment of the recorded audio, the first and second audio event segments determined based upon identifying changes in at least one identified active speaker, the first audio event segment having a first set of one or more active speakers and the second audio event segment having a second set of one or more active speakers, the changes in at least one identified active speaker identified based upon matching the received voice exemplars with audio in the first and second audio event segments;
receive verification information for the first audio event segment from at least one person from the first set of active speakers for the first audio event segment, the verification information indicating an opinion of the at least one person as to whether the first audio event segment is an accurate reproduction of the audio event;
receive first digital rights management (DRM) information for the first audio event segment from the at least one person from the first set of active speakers;
receive second DRM information for the second audio event segment, the first DRM information specifying a usage restriction on the first audio event segment and the second DRM information specifying a usage restriction on the second audio event segment, wherein the first and second DRM information specify different usage restrictions; and
produce a master audio file including the first and second audio event segments, and the verification information, wherein the master audio file implements a first DRM restriction corresponding to the first DRM information for the first audio event segment and a second DRM restriction corresponding to the second DRM information for the second audio event segment responsive to receipt of the verification information.
15. The system of claim 14, wherein the operations comprise operations to send a verification request for the first audio event segment to the at least one person from the first set of active speakers.
16. The system of claim 15, wherein the verification request includes a transcript of the first audio event segment.
17. The system of claim 14, wherein the digital rights management information defines a set of access permissions for a user group.
18. The system of claim 17, wherein the set of access permissions for the first audio event segment includes at least two of read access, write access, and distribute access.
19. The system of claim 14, wherein the memory comprises instructions, which when performed by the computer processor, causes the system to perform operations to:
receive a command to play back the master audio file from a user,
determine that the user has satisfied the usage restriction applied to the first audio event segment;
responsive to determining that the user has not satisfied the first DRM restriction applied to the first audio event segment, refraining from playing the audio of the first audio event segment; and
responsive to determining that the user has satisfied the second DRM restriction applied to the second audio event segment, playing the audio of the second audio event segment.
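The per-segment playback behavior recited in claims 8 and 19 can be pictured with a minimal sketch. The `Segment` type, its `required_group` field, and the `playback_plan` function are hypothetical names chosen for illustration, and modeling a usage restriction as membership in a user group is an assumption; the claims do not specify how a restriction is represented or checked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    segment_id: str
    speaker: str
    required_group: str  # hypothetical stand-in for a DRM usage restriction


def playback_plan(segments, user_groups):
    """Pair each segment id with True (play) or False (refrain from playing)."""
    return [(s.segment_id, s.required_group in user_groups) for s in segments]


# Hypothetical master audio file with a different restriction on each segment.
master_audio_file = [
    Segment("seg-1", "speaker-a", "board"),
    Segment("seg-2", "speaker-b", "staff"),
]
```

For a user who belongs only to the `staff` group, `playback_plan(master_audio_file, {"staff"})` marks `seg-1` as withheld and `seg-2` as playable, which mirrors the two responsive steps in the claims.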
