This wave of automation will change the world forever, and it will certainly affect today's jobs
This year's wave of automation will not be like the first round some 50 years ago, in the era of the industrial revolution. I had the opportunity to study the use of machinery to make work more efficient, and the design of systems and methods that let (human) workers collaborate with machines effectively, to achieve good productivity in industrial factories. Everything I learned back then has since changed a great deal, because computing and communication technology now lets machines exchange data with one another and connects that data through the Internet, which is ubiquitously accessible in today's world. And now another major shift is underway: Google, Amazon, and even Apple are moving from programming whose logic is fixed by conditions specified in advance to logic that can change on its own, learning to adapt by training on example data or example procedures. The term in use for this today is Machine Learning...
          Customer Success Manager - OneSpot        
Austin, TX - Client Services - Full Time
Company Overview

OneSpot is a technology platform sitting at the intersection of personalization technology and content marketing. The company's machine learning based Content Sequencing engine helps the world's best brands use their ...
          Will Machine Learning Revolutionize Healthcare (and other) Collections?        
The intersection of behavioral psychology, classic marketing strategy and tech for the masses—including machine learning—has added tremendous ...
          Nielsen Acquires AI-Powered Sports Marketing Startup VBrand        
vBrand has developed a machine learning-enabled platform to measure brand exposure and impact in sports programming. The acquisition of ...
          Andrew Ng will help you change the world with AI if you know calculus and Python        
Coursera was originally set up to offer an online class in machine learning; deep learning is a variety of that, involving exceptionally large datasets.
          Nvidia: Luck Is Not A Moat        
NVIDIA itself is not an AI powerhouse, and its growth has come from the happy coincidence that GPUs are more suited to machine learning than CPUs ...
          Rare Disease Treatments to Be Discovered by Machine Learning and Simulation Platform        
"I look forward to combining the GNS REFS platform with Alexion's deep expertise in data sciences to accelerate the discovery of innovative medicines ...
          Machine Learning Model Tracks US Spy Planes        
Reporters-turned-data-scientists started by making calculations describing the flight characteristics of about 20,000 aircraft contained in a database ...
The news web site BuzzFeed did just that, reporting this week that it employed a machine-learning algorithm to first recognize known spy planes, and ...
          Car Problems? AI And Machine Learning To The Rescue!        
Recalling faulty products is difficult in any industry. The fallout can be horrific, and have serious implications for companies of all sizes - just look at the ...
... of 'Digital Twins', granular virtual copies of parts in the manufacturing process, which are enabled by deep learning and artificial intelligence.
          How Machine Learning Is Helping Neuroscientists Crack Our Neural Code        
A big challenge in neuroscience is understanding how the brain encodes information. Neural networks are turning out to be great code crackers.
          5 Red Hot Picks for Electrifying Gains From AI & Automation        
Two catchphrases have caught the imagination of investors and companies at the moment, deep learning networks and machine learning. Both these ...
          F# + ML |> MVP Summit: Talk recordings, slides and source code        

I was fortunate enough to make it to the Microsoft MVP summit this year. I didn't learn anything secret (and even if I did, I wouldn't tell you!) but one thing I did learn is that there is a lot of interest in data science and machine learning both inside Microsoft and in the MVP community. What was less expected and more exciting was that there was also a lot of interest in F#, which is a perfect fit for both of these topics!

When I visited Microsoft back in May to talk about Scalable Machine Learning and Data Science with F# at an internal event, I ended up chatting with the organizer about F# and we agreed that it would be nice to do more on F#, which is how we ended up organizing the F# + ML |> MVP Summit 2015 mini-conference on the Friday after the summit.


          Special 322: WWDC 2017        

TWiT Live Specials (Video-HD)

At their 2017 Worldwide Developers Conference, Apple announced six new things. tvOS gets an update and Amazon Prime Video. watchOS 4 includes new faces and functionality. macOS High Sierra will incorporate Apple File System and improvements to Safari, Mail, and Photos. MacBook Pros and iPads get a spec bump, and iMacs can now handle VR with the HTC Vive. The iMac Pro, coming later this year, might fill the Mac Pro-sized hole in power users' hearts. iOS 11 will include a slew of new features and improvements, most notably machine learning and augmented reality. The new 10.5" iPad Pro further blurs the line between tablets and power computers. And last, but surely not least, the new HomePod is the rumored "Siri Speaker," with a big focus on music.

Hosts: Nathan Olivarez-Giles and Megan Morrone

Download or subscribe to this show at https://twit.tv/shows/twit-live-specials.

Thanks to CacheFly for the bandwidth for this special presentation.


          4 Ways Machine Learning Helps Companies Acknowledge Its Audience        
Here’s an icebreaker: what’s your routine internet behavior? Are your Fourth of July celebrations more likely to start with you scouting out holiday sales rather than a picnic spot? Is checking social media updates as important...
          Who Sets Policy?        
In April the New York Times Magazine ran an article, Is It O.K. to Tinker With the Environment to Fight Climate Change? The article asks about the ethics of even running tests on such methods and has this quote from David Battisti, an atmospheric scientist at UW.
Name a technology humans have developed that they haven't used. I can't think of any. So we can work on this for sure. But we are in this dilemma: Once we do develop this technology, it will be tempting to use it.
The article skirts the question of who makes this decision. Maybe the United Nations, after some unlikely agreement among major powers. But what if the UN doesn't act and some billionaire decides to fund a project?

As computer scientists we start to face these questions as software in our hyper-connected world starts to change society in unpredictable ways. How do we balance privacy, security, usability and fairness in communications and machine learning? What about net neutrality, self-driving cars, autonomous military robots? Job disruption from automation?

We have governments to deal with these challenges. But the world seems to have lost trust in its politicians and governments don't agree. How does one set different rules across states and countries which apply to software services over the Internet?

All too often companies set these policies, at least the default policies, until government steps in. Uber didn't ask permission to completely change the paid-ride business, and only a few places pushed back. Google, Facebook, etc. use machine learning with abandon, until some governments try to rein them in. The Department of Defense and the NSA, in some sense industries within government, set their policies often without public debate.

What is our role as computer scientists? It's not wrong to create the technologies, but we should acknowledge the ethical questions that come with them and what we technically can and cannot do to address them. Keep people informed so the decision makers, whoever they may be, at least have the right knowledge to make their choices.
          The Walls You Hit Before Applying Machine Learning to Real Tasks, and How to Break Through Them        

When a machine learning project comes up at work, if only people with prior machine learning experience can take it on, handover problems tend to appear later. To prevent this, it is desirable to build a setup where engineers who are interested in machine learning but have never used it can also take charge. For directors and engineers who don't know machine learning well, however, it is quite hard to judge what level of understanding is enough to entrust someone with a task. So in this entry I'll describe what I consider the walls you have to climb over before doing machine learning on real tasks.

The first wall: solving machine learning problems on clean data

  • Can handle the kind of clean data used in lectures
    • It comes in matrix form, with no missing values or outliers
  • Can solve such data as a regression or classification problem
    • It's fine to use a library when actually solving it
    • Knows what to do to evaluate a method (measuring Precision, Recall, RMSE, and so on); a minimal sketch of this stage follows this list
  • Doesn't need to be able to derive the algorithm, but has a rough grasp of what problem the method is solving
    • For regression/classification, which objective function is being optimized
    • What the intuition behind that objective function is
    • Knows where to look when training or prediction goes wrong
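
A minimal sketch of what this first stage asks for, in Python (assuming scikit-learn; the dataset and classifier choice are mine for illustration): fit a library classifier on a clean, matrix-shaped dataset and evaluate it with precision and recall.

```python
# "First wall" in practice: clean data, a library classifier,
# and a proper evaluation (assumes scikit-learn).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)  # matrix-shaped, no missing values
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = clf.predict(X_test)

# Multi-class, so average the per-class scores ("macro").
print("precision:", precision_score(y_test, pred, average="macro"))
print("recall:   ", recall_score(y_test, pred, average="macro"))
```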

The second wall: solving machine learning problems on messy real-world data

  • Can reduce unformatted data to a feature representation that machine learning can handle (see the sketch after this list)
    • Example: can turn text data into a BoW representation
    • Example: can drop anomalies and outliers in preprocessing
  • Can tune until the required performance is reached
    • Feature selection, regularization terms, frequency cutoffs, handling label imbalance, and so on
    • Given requests like "a bit of noise is fine, just minimize what you miss" or "misses are fine, just show clean results", can imagine how to tune for them
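
As a sketch of the BoW example above, in Python (assuming scikit-learn; the toy corpus and labels are invented for illustration):

```python
# Raw text -> bag-of-words features -> regularized classifier.
# min_df is one knob for the frequency cutoff mentioned above;
# C controls regularization strength (assumes scikit-learn).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["free prize now", "meeting at noon", "win a free prize", "lunch at noon?"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam

model = make_pipeline(
    CountVectorizer(min_df=1),
    LogisticRegression(C=1.0),
).fit(texts, labels)

print(model.predict(["free lunch prize"]))
```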

The third wall: formulating the problem as machine learning and bringing it into the service

  • Can judge whether the problem can be solved with machine learning, and whether it should be solved with machine learning at all
  • Once machine learning is chosen, knows how to formalize the problem
    • Example: formalize it as a ranking problem, or handle it as a combination of classification problems
  • Knows how to build or collect training and evaluation data
    • Knows roughly how much training/evaluation data is needed before training and evaluation become reasonably stable
    • Can write guidelines for producing consistent labeled data (this is quite hard...)
    • Blindly adding labeled data does not necessarily improve accuracy

How do you get over these walls?

  • For the first two walls, there are now mountains of machine learning books, blogs, and libraries, so the bar is far lower than it used to be
    • There is also the Hatena textbook!
    • Conversely, with so much material it's easy not to know where to start, so it helps when someone experienced lays out the shortest path for the problem at hand
    • Have newcomers and experienced people solve the same problem on kaggle or similar, and explore together which tricks improve accuracy
  • The third wall depends largely on experience, so it works best when an experienced person acts as a mentor, advising while the newcomer works hands-on

The fourth wall: operating machine learning after it has been deployed

言語処理のための機械学習入門 (自然言語処理シリーズ) [Introduction to Machine Learning for Natural Language Processing (Natural Language Processing Series)]

          ä¸å®šæœŸML&NLPå ±#4        

最近の機械学習&自然言語処理に関する情報をまとめるコーナーです。前回はこちら。このエントリ忘れてるよというのがありましたら、たれこみフォームから教えてもらえるとうれしいです。

論文

  • [1701.07875] Wasserstein GAN
    • Generative tasks, GANs included, are known to be hard to train; this paper reports that using the Wasserstein distance as the training metric makes learning stable (the distance is written out just after this list)
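
For reference, a minimal statement of the distance the paper builds on (my summary, not the post's): the Wasserstein distance between the data distribution $P_r$ and the generator distribution $P_g$ is the cost of the cheapest transport plan between them, and by Kantorovich-Rubinstein duality it can be estimated with a 1-Lipschitz critic $f$:

$$W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x,y) \sim \gamma}\big[\lVert x - y \rVert\big] = \sup_{\lVert f \rVert_L \le 1} \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_g}[f(x)]$$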

Blogs / Study Group Materials

speakerdeck.com

I hadn't heard of starchart, a tool for version-controlling machine learning models. There was an explanatory article too.

speakerdeck.com

This one is less about machine learning and more about log infrastructure.

"Trying out aggregating 30 TB of web behavior logs with AWS Athena" from Tetsutaro Watanabe
www.slideshare.net

Business

Conferences / Study Groups

NIPS Reading Group

Kaggle Tokyo Meetup #2

Whole Brain Architecture Young Researchers' Group (全脳アーキテクチャ若手の会)

AAAI2017

Other

機械学習のための連続最適化 (機械学習プロフェッショナルシリーズ) [Continuous Optimization for Machine Learning (Machine Learning Professional Series)]

関係データ学習 (機械学習プロフェッショナルシリーズ) [Relational Data Learning (Machine Learning Professional Series)]


          ä¸å®šæœŸML&NLPå ±#2        

最近の機械学習&自然言語処理に関する情報をまとめるコーナーです。今回は医療品設計やセキュリティなど、自分があまり知らなかった分野での機械学習適用事例が多く、勉強になるものが多かったです。前回はこちら。

このエントリ忘れてるよというのがありましたら、たれこみフォームから教えてもらえるとうれしいです。

Papers

  • [1612.03242] StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
    • GANs (Generative Adversarial Networks) are widely used for generating images from text (captions); this paper stacks GANs (first generate a rough image, then refine it with the stacked one) and achieves remarkably precise generation. I don't follow GANs closely, but the generated images are strikingly un-blurry, which is startling
  • Statistical Natural Language Processing and Information Theory
    • Slides from an invited talk by Mochihashi-san of the Institute of Statistical Mathematics at an information theory workshop
    • I didn't know about the quantification of typicality he is working on with Nakano-san, which was interesting (paper)

Blogs / Study Group Materials

Business

  • Google's self-driving car unit becomes independent as Waymo, and may partner with Chrysler to expand the business | TechCrunch Japan
    • A story that they had abandoned self-driving car development circulated briefly, but it appears to have been a bogus report
  • Understanding what the other person really feels even when it isn't put into words! AI technology that quantifies satisfaction during conversations : FUJITSU JOURNAL
    • Uses machine learning to identify satisfaction in call-center conversations from accent and intonation
  • How we learn how you learn | Making Duolingo Blog
    • How Duolingo, maker of the language-learning app, uses machine learning to choose which words a user should study (paper)
    • Useful because it covers not just the accuracy gains but also the impact on the service (learning became more efficient and users started using online fome and the like)
    • I had slides from a very quick internal introduction of the underlying paper, so I'm attaching them below

Duolingo.pptx from syou6162

Conferences / Study Groups

NIPS2016

  • NIPS 2016 Participation Report - Qiita
  • www.reddit.com/r/MachineLearning/comments/5hwqeb/project_all_code_implementations_for_nips_2016
    • A list of NIPS2016 papers whose implementations are public on GitHub and elsewhere; apparently already more than 20
    • With things like review bonus points for releasing implementations and datasets, releasing them is becoming the norm, which is desirable from a reproducibility standpoint too

Other

Datasets

  • GitHub - yahoojapan/YJCaptions
    • Yahoo released Japanese caption data for images (about 120,000 captions); the underlying images come from Microsoft's MS COCO dataset
  • FMA: A Dataset For Music Analysis

  • This one also turned out to be one of the datasets annotated on top of MS COCO


          ä¸å®šæœŸML&NLPå ±#1        

先日、社内で定期的に行なわれているフロントエンド会にお邪魔してきました(podcastが配信されています)。jser.infoというサイトを参照しながら雑談していたのですが、最近のフロントエンドの動向を知るという目的にはこのサイトなかなかよさそうでした。

機械学習勉強会でもランチタイムに最近の話題を見ながら雑談しているのですが、ネタになるエントリ一覧とそれに対するコメントは社外に公開して別に問題ないなと思ったので、不定期報という形で出してみることにしました。自然言語処理も自分がカバーできる範囲限られているし、自然言語処理以外の機械学習の話はかなりカバーできないので、たれこみフォームも作りました。耳寄りな情報、お待ちしております:)

Papers

Blogs / Study Group Materials

Business

Conferences / Study Groups

Coling2016

Held in Osaka this year.

NIPS2016

NL-ken (the 229th Meeting of the SIG on Natural Language Processing)

(2) [NLC] An Attempt at Efficient Dialogue Log Collection Using Gamification
○叶内 晨・小町 守(首都大東京)

Methods that can learn once data is collected keep appearing, so my interest is in how to collect data efficiently.

(5) [NL] Automatic Evaluation of the Naturalness of Topic Transitions in Chat-Oriented Dialogue Systems
○豊嶋章宏(NAIST)・杉山弘晃(NTT)・吉野幸一郎・中村 哲(NAIST)

(20) [NL] 14:30 – 15:00
An Automatic Evaluation Metric for Japanese-English Machine Translation Based on Word Alignment Using Word Embeddings
○松尾潤樹・小町 守(首都大)・須藤克仁(NTT)

Along with data collection, I think evaluation of non-analysis tasks will become a hot topic going forward.

(15) [NL] 17:25 – 17:55
Operating NEologd, a Dictionary-Generation System for Word Segmentation: Document Classification as an Example
○佐藤敏紀・橋本泰一(LINE)・奥村 学(東工大)

There was also a talk on NEologd, which has recently come into use in many places.

Annual Meeting of the Association for Natural Language Processing 2017 (言語処理学会2017)

The tutorials, themed sessions, and workshop program have also been announced.

Crowdsourcing: Prof. 馬場 雪乃 (Kyoto University)
Neural machine translation: Prof. 中澤 敏明 (JST)
Universal Dependencies: Prof. 金山 博 (IBM Research - Tokyo) and Prof. 田中 貴秋 (NTT Communication Science Laboratories)
Cognitive linguistics: Prof. 西村 義樹 (University of Tokyo)

I'm particularly interested in neural machine translation and Universal Dependencies.

IM Drinks 2016 (IM飲み2016)

Other


          Google is harnessing machine learning to cut data center energy        
Originally posted on Gigaom:
Leave it to Google (s GOOG) to have an engineer so brainy he hacks out machine learning models in his 20 percent time. Google says that recently it’s been using machine learning — developed by data center engineer Jim Gao (his Googler nickname is “Boy Genius”) — to predict the energy efficiency…
          Mastering Java Machine Learning        

eBook Details: Paperback: 556 pages. Publisher: WOW! eBook (August 4, 2017). Language: English. ISBN-10: 1785880519. ISBN-13: 978-1785880513. eBook Description: Mastering Java Machine Learning: Become an advanced practitioner with this progressive set of master classes on application-oriented machine learning.

The post Mastering Java Machine Learning appeared first on WOW! eBook: Free eBooks Download.


          The machines are coming: A rethink of Africa’s employment is needed        
The Fourth Industrial Revolution will dramatically reshape the world of work and force us to rethink our approach to our careers, our lives, and our aspirations. With a global market estimated to reach $70 billion by 2020, machine learning is driving fundamental change in the way every industry operates. Learning algorithms are already pioneering advances […]
          Hypersphere Volume        
In our last article we looked at how the dimension of data space impacts Machine Learning algorithms. This is often referred to as the curse of dimensionality. At the heart of the article we discussed the fact that a hypersphere's hyper-volume tends to zero as dimension increases. Here I want to demonstrate how to find […]
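
For reference, the standard closed form the post works toward (a summary, not the post's own derivation): the hyper-volume of an n-dimensional ball of radius $r$ is

$$V_n(r) = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}\, r^n$$

Because the Gamma function in the denominator grows faster than $\pi^{n/2}$, $V_n(1) \to 0$ as $n \to \infty$, which is exactly the vanishing hypersphere volume described above.
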
          Hyperspheres & the curse of dimensionality        
I previously talked about the curse of dimensionality (more than 2 years ago) related to Machine Learning. Here I wanted to discuss it in more depth and dive into the mathematics of it. High dimensions might sound like Physics’ string theory where our universe is made of more than 4 dimensions.  This isn’t what we […]
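
A quick numerical illustration of the curse (a sketch in Python with NumPy, not the post's own code): as the dimension grows, pairwise distances between random points concentrate around their mean, so "near" and "far" neighbors become hard to tell apart.

```python
# Distance concentration in high dimensions (assumes NumPy).
import numpy as np

rng = np.random.default_rng(0)
n = 500
for d in (2, 10, 100, 1000):
    X = rng.random((n, d))                        # n random points in the unit cube
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T  # squared pairwise distances
    dists = np.sqrt(np.clip(d2[np.triu_indices(n, k=1)], 0, None))
    # The relative spread (std/mean) shrinks as d grows: distances concentrate.
    print(f"d={d:4d}  mean={dists.mean():.3f}  std/mean={dists.std() / dists.mean():.3f}")
```
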
          Large-Scale MOO Experiments with SHARK – Oracle Grid Engine        
This post explains how to conduct large-scale MOO experiments with the SHARK machine learning library on clusters running Oracle Grid Engine. An experiment consists of three phases: front approximation; performance indicator calculation; result accumulation and statistics calculation. Within this post, I'm going to focus on the first step. Front Approximation In this phase, the Pareto […]
          Shark 3.x – Continuous Integration        
Taken from the SHARK website: SHARK is a modular C++ library for the design and optimization of adaptive systems. It provides methods for linear and nonlinear optimization, in particular evolutionary and gradient-based algorithms, kernel-based learning algorithms and neural networks, and various other machine learning techniques. SHARK serves as a toolbox to support real world applications […]
          DigitalGlobe’s Tony Frazier: Govt Leaders Recognize Need to Leverage Machine Learning, Big Data Analytics        
Tony Frazier, senior vice president of government solutions at DigitalGlobe, has said the U.S. government acknowledges the need to leverage big data analytics, machine learning, automation and other commercial technology platforms in order to help transform the intelligence community and global mapping efforts. Frazier wrote in a blog post published Friday that government leaders such as Robert Cardillo, director […]
          AI – What Chief Compliance Officers Care About        


Arguably, there are more financial institutions located in the New York metropolitan area than anywhere else on the planet, so it was only fitting for a conference on AI, Technology Innovation & Compliance to be held in NYC – at the storied Princeton Club, no less. A few weeks ago I had the pleasure of speaking at this one-day conference, and found the attendees’ receptivity to artificial intelligence (AI), and creativity in applying it, to be inspiring and energizing. Here’s what I learned.

CCOs Want AI Choices

As you might expect, the Chief Compliance Officers (CCOs) attending the AI conference were extremely interested in applying artificial intelligence to their business, whether in the form of machine learning models, natural language processing or robotic process automation – or all three. These CCOs already had a good understanding of AI in the context of compliance, knowing that:

  • Working through sets of rules will not find “unknown unknowns”
  • They should take a risk-based approach in determining where and how to divert resources to AI-based methods in order to find the big breakthroughs.

All understood the importance of data, and how getting the data you need to provide to the AI system is job number one. Otherwise, it’s “garbage in, garbage out.” I also discussed how to provide governance around the single source of data, the importance of regular updating, and how to ensure permissible use and quality.

AI Should Explain Itself

Explainable AI (XAI) is a big topic of interest to me, and among the CCOs at the conference there was an appreciation that AI needs to be explainable, particularly in the context of compliance with GDPR. The audience also recognized that their organizations need to layer in the right governance processes around model development, deployment, and monitoring: key steps in the journey toward XAI. I reviewed the current state of the art in Explainable AI methods and where they are heading: toward AI that is more grey-boxed.

Ethics and Safety Matter

In pretty much every AI conversation I have, ethics are the subject of lively discussion. The New York AI conference was no exception. The panel members and I talked about how any given AI system is not inherently ‘ethical’; it learns from the inputs it’s given. The modelers who build the AI system must not pass in sensitive data fields, and those same modelers need to examine whether inadvertent biases are derived from the inputs in the training of the machine learning model.

Here, I was glad to be able to share some of the organizational learning FICO has accumulated over decades of work in developing analytic models for the FICO® Score, our fraud, anti-money laundering (AML) products and many others.

AI safety was another hot topic. I shared that although models will make mistakes and there needs to be a risk-based approach, machines often make better decisions than humans, as with autopilots on airplanes. Humans need to be there to step in if something is changing to the degree that the AI system may not make an optimal decision; this could arise as a change in environment or data character.

In the end, an AI system will work with the data on which it has trained, and is trained to find patterns in it, but the model itself is not necessarily curious; it is still constrained by how the algorithm was developed, how the problem was posed, and the data it trains on.

Open Source Is Risky

Finally, the panel and I talked about AI software and development practices, including the risks of open source software and open source development platforms. I indicated that I am not a fan of open source, as it often leads to scientists using algorithms incorrectly, or relying on someone else’s implementation. Building an AI implementation from scratch, or from an open source development platform, gives data scientists more hands-on control over the quality of the algorithms, assumptions, and ultimately the AI model’s success in use.

I am honored to have been invited to participate in Compliance Week’s AI Innovation in Compliance conference. Catch me at my upcoming speaking events in the next month: The University of Edinburgh Credit Scoring and Credit Control XV Conference on August 30-September 1, and the Naval Air Systems Command Data Challenge Summit.

In between speaking gigs I’m leading FICO’s 100-strong analytics and AI development team, and commenting on Twitter @ScottZoldi. Follow me, thanks!

The post AI – What Chief Compliance Officers Care About appeared first on FICO.


          Three Keys to Advancing your Digital Transformation        


With today’s proliferation of data, digital transformation (DX) has become more than a hot topic: It’s an imperative for businesses of all shapes and sizes. The collision of data, analytics and technology has businesses, analysts and consumers excited — and scared — about what could happen next.

On one hand, everyone from banks to bagel shops and travel sites to tractor manufacturers has found new ways to connect the dots in their businesses while forging stronger, more dynamic customer engagement. Artificial intelligence (AI) has come of age in technologies such as smart sensors, robotic arms, and devices that can turn lights and heat on and off, adjust for changes in conditions and preferences, and even automatically reorder food and supplies for us.

However, today's Chief Analytics Officer (and Chief Data Officer and Chief Digital Officer, for example) faces both the promise and precariousness of digitizing business. While significant opportunities abound to drive revenues and customer connectivity, any leader will freely confess there are myriad technological, business and human obstacles to transforming even one element of business, introducing a new unique product or even meeting regulatory requirements.

The Big Data Dilemma

Big Data is at once the promise of the DX and its biggest roadblock. A recent Harvard Business Review article put it succinctly: “Businesses today are constantly generating enormous amounts of data, but that doesn’t always translate to actionable information.”

When 150 data scientists were asked if they had built a machine learning model, roughly one-third raised their hands. How many had deployed and/or used this model to generate value, and evaluated it? Not a single one.

This doesn’t invalidate the role of Big Data in achieving DX. To the contrary: The key to leveraging Big Data is understanding what its role is in solving your business problems, and then building strategies to make that happen — understanding, of course, that there will be missteps and possibly complete meltdowns along the way.

In fact, Big Data is just one component of DX that you need to think about. Your technology infrastructure and investments (including packaged applications, databases, and analytic and BI tools) need to similarly be rationalized and ultimately monetized, to deliver the true value they can bring to DX.

Odds are many components will either be retired or repurposed, and you’ll likely come to the same conclusion as everyone else that your business users are going to be key players in how DX technology solutions get built and used. That means your technology and analytic tools need to allow you the agility and flexibility to prototype and deploy quickly; evolve at the speed of business; and empower people across functions and lines of business to collaborate more than they’ve ever done before.

Beyond mapping out your overarching data, technology and analytic strategies, there are several areas to consider on your DX journey. Over the next three posts, I’ll focus on how to:

  1. Visualize your digital business, not your competitors’
  2. Unleash the knowledge hidden within your most critical assets
  3. Embrace the role and evolution of analytics within your journey

To whet your appetite, check out this short video on the role of AI in making DX-powered decisions.

 

The post Three Keys to Advancing your Digital Transformation appeared first on FICO.


          The Future of AI: Redefining How We Imagine        


To commemorate the silver jubilee of FICO’s use of artificial intelligence and machine learning, we asked FICO employees a question: What does the future of AI look like? The post below is one of the thought-provoking responses, from Sadat Nazrul, an analytic scientist at FICO, working in San Diego.

Looking at the next 20 years, I see us moving well beyond the productivity enhancements AI has brought about so far. With the advent of AI, we will be seeing a renaissance in our own personal lives as well as society as a whole.

Healthcare

Today, our gadgets can monitor the number of steps we take, our heart rate, and even the contents of our sweat. All this rich information allows a team of doctors, engineers and analysts to monitor our well-being and to maintain our peak performance. Similarly, with innovations in genomic sequencing and neural mapping passing FDA trials, we will soon be seeing huge leaps in the field of personalized medicine. AI will help us understand individual physiological needs in order to come up with customized prescriptions and improve our overall health standards.

Cognitive Abilities

People are keen to improve cognition. Who wouldn’t want to remember names and faces better, grasp difficult abstract ideas more quickly, and “see connections” better? Who would seriously object to being able to appreciate music at a deeper level?

The value of optimal cognitive functioning is so obvious that to elaborate the point may be unnecessary. Today we express ourselves through art, movies, music, blogs, and a wide range of social media applications. In the field of image recognition, AI can already “see” better than we can by observing far more than RGB. Virtual Reality allows us to feel as though we have teleported to another world. New idioms of data visualization and Dimensionality Reduction algorithms are always being produced for us to better experience the world around us.

We are constantly trying to enhance our 5 senses to go beyond our human limits. 10 years from now, these innovations, coupled with IoT gadgets, will act as extensions of who we are and help us experience our surroundings more profoundly.

Emotional Intelligence

Just as we enhance our 5 cognitive senses, so too do we enhance our ability to express ourselves and to understand those around us.

Many times, we don’t even know what we want. We strive to connect with those around us in a specific way, or consume a particular product, just so we can feel a very specific emotion that we fail to describe. We feel much more than just happiness, sadness, anger, anxiety or fear. Our emotions are a complex combination of all of the above.

With the innovations in neural mapping, we will better understand who we are as human beings and better understand the myriad emotional states that we can attain. Our complex emotional modes will be better understood as we perform unsupervised learning on brain waves and help find innovative ways to improve our emotional intelligence. This would include both understanding our own emotions and being more sensitive towards those around us.

Perhaps we can unlock new emotions that we have never experienced before. In the right hands, AI can act as our extensions to help us form meaningful bonds with the people we value in our lives.

Experience and Imagination

The effect of AI on our experience and imagination would result from an aggregate of better cognitive abilities and emotional intelligence. The latest innovations in AI may help us unlock new cognitive experiences and emotional states.

Let’s imagine that the modes of experience we have today are represented in a space X. 10 years from now, let’s say that the modes of experience are represented in a space Y. Space Y will be significantly bigger than space X. This futuristic space Y may include new types of emotions beyond our conventional happy, sad and mad. This new space Y can even allow us to comprehend abstract thoughts that reflect what we wish to express more accurately.

This new space of Y can actually unlock a new world of possibilities that lies beyond our current imagination. The people of the future will think, feel and experience the world at a much richer degree than we can today.

Communication

10 years ago, most of our communication was restricted to phones and email. Today, we have access to video conferences, Virtual Reality and a wide array of applications on social media. As we enhance our cognitive abilities and emotional intelligence, we can express ourselves through idioms of far greater resolution and lower levels of abstraction.

We already have students from the University of Florida achieving control of drones using nothing but the mind. We even have access to vibrating gaming consoles that take advantage of our sense of touch for making that Mario Kart game that much more realistic. 10 years from now, the way we communicate with each other will be much deeper and more expressive than today. If we are hopeful enough, we might even catch a glimpse of the holograms of Star Wars and telepathic communications of X-Men.

The Virtual Realities of today limit us to vision and hearing. In the future, Virtual Reality might actually allow us to smell, taste and touch our virtual environment. Along with access to all 5 senses, our emotional reaction to certain situations might be fine-tuned and optimized with the power of AI. This might mean sharing the fear of the main characters in Paranormal Activity, feeling the heartbreak of Emma, or being excited about the adventures of Pokemon. All this can be possible as we explore the theatrical arts of smart Virtual Reality consoles.

Information Security

AI allows us to unearth more unstructured data at higher velocity and generate valuable insight. However, the risk of that very sensitive data falling into the wrong hands will also escalate.

Today, cybersecurity is a major concern on everyone’s mind. In fact, 2017 is the year the fingerprint got hacked. With the help of AI, information technology will get more sophisticated in order to protect the things we value in our lives.

It is human nature to want to go beyond our limits and become something much more. Everyone wants to live longer healthy lives, experience more vividly and feel more deeply. Technology is simply a means to achieve that end.

See other FICO posts on artificial intelligence.

The post The Future of AI: Redefining How We Imagine appeared first on FICO.


          How AI will make smartphones much smarter        

The future of the smartphone is rooted in advancements of artificial intelligence and machine learning. Through the wonders of AI, your phone will be able to track, interpret, and respond to patterns and trends that it recognizes as “desirable” or “necessary.” It will organize, match, and learn every single day about who you are and […]


          DNN Hangout - August 2015 - Introducing Aricie’s PortalKeeper Module        

Jean-Sylvain Boige introduces us to PortalKeeper by Aricie

It’s always exciting for me to meet someone in the DNN community that I haven’t met before. This hangout is one of those times. In this hangout, we speak with Jess (or Jean-Sylvain Boige) of Aricie. He’s the CTO of Aricie and the genius mind behind the module for DNN called PortalKeeper. I must admit, I didn’t know anything about PortalKeeper before this hangout, but rest assured… this is going to be a pretty standard tool of mine for current and future projects. If you’ve gotten even the beginnings of a technical bone in your body, you’re sure to geek-out just like I did!

Want to Be on the Show?

We are always looking for new people to be featured on the show. You don’t have to be an “expert” in anything. Just come prepared to chat with us about anything interesting about DNN, no matter how big or small.

Please let me know in the comments or via email if you’d like to be on DNN Hangout.

Next Episode

In our next episode, we’ll be having a very special show. Joe and I will be taking a deep dive into the source code of DNN. When it first began, it was a fairly simple project. Not anymore.

We’re looking for any special guest to join us for this. You don’t have to do anything, except help us introduce you, and ask questions during the hangout.

Join the Hangout

Site of the Month

We didn’t have any sites to show off this month. It was mostly my fault for not promoting it as much as I had in the past. I could have shown some that I have, but I’d much rather highlight yours. Please let me know if you’d like me to do a quick segment on one of your sites.

Jean-Sylvain Boige: Introducing the PortalKeeper Module

Show Notes

Events

  • DNNCon 2015 – Unfortunately, there’s no update on this yet (if you want to organize it, please let me know)

Articles, Videos, and Blogs

Extension Updates


          IBM releases Watson Machine Learning for a general audience        

Not content with beating humans at quiz shows, IBM is moving forward with its Watson Machine Learning service. Now generally available after a year’s worth of beta testing, WML promises to address the needs of both data scientists and devs.

The post IBM releases Watson Machine Learning for a general audience appeared first on JAXenter.


          Top 10 Java stories of July        

Ah, summer. The sun is hot, the days are long, and the Java ecosystem keeps on spinning. What did we read last month? Well, it turns out we all really liked stories about the upcoming release of Java 9, machine learning for fun and profit, and the latest updates on Angular.

The post Top 10 Java stories of July appeared first on JAXenter.


          This Photo Book Was Curated And Annotated Using Machine Learning        
Computed Curation used an algorithm to curate and write descriptions for a volume of 207 photographs
          Will Transactional Marketing Displace Existing Forms of Promotion?        
Digitization, technological progress, big data, and machine learning make it possible to use available information ever more effectively and to maximize the effectiveness of communication with the customer. One solution built on high-quality data is so-called transactional marketing, which, according to 3/4 of marketers, will replace existing forms of promotion, advertising, and customer loyalty programs.
          Running Google's Open-Source word2vec on Chinese Data        
http://www.cnblogs.com/hebin/p/3507609.html

I'd long heard that word2vec works very well on word-to-word similarity, so recently I ran Google's open-source code (https://code.google.com/p/word2vec/) myself.

1. Corpus

First, prepare the data: I used the full-web news dataset (SogouCA) recommended on blogs, 2.1 GB in size.

Download the package SogouCA.tar.gz from the FTP server:
wget ftp://ftp.labs.sogou.com/Data/SogouCA/SogouCA.tar.gz --ftp-user=hebin_hit@foxmail.com --ftp-password=4FqLSYdNcrDXvNDi -r

Unpack the archive:

gzip -d SogouCA.tar.gz
tar -xvf SogouCA.tar

Then merge the generated txt files into SogouCA.txt, extract the lines containing content and convert the encoding, producing the corpus corpus.txt, 2.7 GB in size.

cat *.txt > SogouCA.txt
cat SogouCA.txt | iconv -f gbk -t utf-8 -c | grep "<content>" > corpus.txt

2. Word Segmentation

Segment corpus.txt with ANSJ to obtain the segmentation result resultbig.txt, 3.1 GB in size.

For the ANSJ segmentation tool, see http://blog.csdn.net/zhaoxinfan/article/details/10403917
In the seg_tool directory, compile and run to obtain resultbig.txt, which contains 426,221 distinct words and 572,308,385 tokens in total.
Segmentation results: (example screenshot omitted)

3. Training Word Vectors with word2vec
nohup ./word2vec -train resultbig.txt -output vectors.bin -cbow 0 -size 200 -window 5 -negative 0 -hs 1 -sample 1e-3 -threads 12 -binary 1 &

vectors.bin is the word-vector file word2vec produces from resultbig.txt; training took an hour and a half on our lab server.

4. Analysis
4.1 Computing similar words:
./distance vectors.bin

./distance can be seen as computing the distance between words: treat each word as a point in the vector space, and distance as the distance between points in that space.

Here are a few examples: (screenshots omitted)

4.2 Latent Linguistic Regularities

After modifying demo-analogy.sh, I obtained examples like the following:
The capital of France is Paris and the capital of the UK is London: vector("法国") - vector("巴黎") + vector("英国") --> vector("伦敦")
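
The same arithmetic can be reproduced in Python with gensim (a sketch under the assumption that vectors.bin was produced by the training command above; gensim reads the word2vec binary format directly):

```python
# Load the binary vectors trained above and query neighbors/analogies
# (assumes the gensim library is installed).
from gensim.models import KeyedVectors

vecs = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)

# Nearest neighbors, i.e. what ./distance prints:
print(vecs.most_similar("伦敦", topn=5))

# vector("法国") - vector("巴黎") + vector("英国") ≈ vector("伦敦"):
print(vecs.most_similar(positive=["法国", "英国"], negative=["巴黎"], topn=3))
```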

4.3 Clustering

Cluster the words in the segmented corpus resultbig.txt and sort them by cluster:

nohup ./word2vec -train resultbig.txt -output classes.txt -cbow 0 -size 200 -window 5 -negative 0 -hs 1 -sample 1e-3 -threads 12 -classes 500 &
sort classes.txt -k 2 -n > classes_sorted_sogouca.txt

For example: (output omitted)

4.4 Phrase Analysis

First derive sogouca_phrase.txt, which contains both words and phrases, from the segmented corpus resultbig.txt, then train vector representations for the words and phrases in that file.

./word2phrase -train resultbig.txt -output sogouca_phrase.txt -threshold 500 -debug 2
./word2vec -train sogouca_phrase.txt -output vectors_sogouca_phrase.bin -cbow 0 -size 300 -window 10 -negative 0 -hs 1 -sample 1e-3 -threads 12 -binary 1

Here are a few similarity examples: (screenshots omitted)

5. References:

1. word2vec: Tool for computing continuous distributed representations of words, https://code.google.com/p/word2vec/

2. Playing with Google's open-source deep learning project word2vec in Chinese, http://www.cnblogs.com/wowarsenal/p/3293586.html

3. Using word2vec to cluster keywords, http://blog.csdn.net/zhaoxinfan/article/details/11069485

6. Papers to read carefully next:

[1] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient Estimation of Word Representations in Vector Space. In Proceedings of Workshop at ICLR, 2013.
[2] Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, and Jeffrey Dean. Distributed Representations of Words and Phrases and their Compositionality. In Proceedings of NIPS, 2013.
[3] Tomas Mikolov, Wen-tau Yih, and Geoffrey Zweig. Linguistic Regularities in Continuous Space Word Representations. In Proceedings of NAACL HLT, 2013.

[4] Collobert R, Weston J, Bottou L, et al. Natural language processing (almost) from scratch[J]. The Journal of Machine Learning Research, 2011, 12: 2493-2537.

 




          My Current Predictions for Thinking Machines        

I’ve been thinking a lot about what I called “Getting Better at Getting Better” in my book. It’s the idea of accelerating machine intelligence, where computers aren’t just getting better at solving problems, but the pace at which they get better increases drastically. I think this comes in two forms: Improved machine learning that improves as we provide...

__

I do a weekly show called Unsupervised Learning, where I curate the most interesting stories in infosec, technology, and humans, and talk about why they matter. You can subscribe here.


          More Spring reading        

Hi folks, here's a nice, juicy reading list for that rainy Saturday afternoon. Well... it has stopped raining here but that should not stop you from reading!

Java

Slightly more hard core Java

Java in the future

A little bit of non-Java

Kubernetes

Systems, data stores and more

Time series

Some fun stuff

Until next time! Ashwin.

          Summer 2016 tech reading        

Hi there! Summer is here and almost gone. So here's a gigantic list of my favorite, recent articles, which I should've shared sooner.

Java

Other languages

Reactive programming

Persistent data structures

CRDT

Data

Systems and other computer science-y stuff

Fun/General

Until next time! Ashwin.

          Chatbot AL-ML by robin9111        
Hi, I am after a chatbot that can use AI & ML to learn and give varied answers on anything I ask. I am not after a static question-and-answer bot like you see on the web. I want something higher end... (Budget: $250 - $750 USD, Jobs: Algorithm, Artificial Intelligence, Data Mining, Machine Learning, Software Architecture)
          April 2016: Machine Learning        
This month on R.Science we’re talking about machine learning, a field of science and a powerful technology that allows machines...
          Google says its custom machine learning chips are often 15-30x faster than GPUs and CPUs        
none
          How Artificial Intelligence Helped Us Predict Forest Loss in the Democratic Republic of the Congo        


Compared with the planet's other large tracts of tropical forests, the forests in the Democratic Republic of Congo (DRC) have remained relatively intact — though that may soon be changing. Driven by factors such as shifting cultivation (slash-and-burn agriculture), fuelwood demand, logging, mining, infrastructure development, population growth and migration, rates of forest loss in the African country have doubled over the past 15 years.

New WRI spatial modeling research sheds light on the uncertain future of the DRC's forests. Based on an application of machine learning, the study focuses on a specific set of the DRC's most intact forested areas identified as containing critical biodiversity habitat; it predicts that without intervention, at least 332,200 hectares (820,884 acres) of these critical forests could be lost by 2025. The collective size of this predicted forest loss — an area the size of Luxembourg within a country the size of Western Europe — may be small, yet millions of people rely on these forests for food, shelter and medicine. This underscores an urgent need to use this study to inform smart land-use decisions in the DRC.

The Study: Where and How

Within the DRC, our research focuses on the landscapes prioritized by the Central Africa Regional Program for the Environment (CARPE), a U.S. Agency for International Development (USAID)-funded program implemented by WRI and other partners. As the program was established to maintain the forests and biodiversity of the Congo Basin, its six DRC landscapes were chosen because they contain forests that are important for biodiversity, as well as the livelihoods of millions of people. They also include a range of land-use types such as settlements, zones for extracting timber and minerals, areas of subsistence agriculture and protected areas.

This study used spatial modeling software and an artificial neural network architecture, loosely based on the functions of a human brain, to map the links between past forest loss and its drivers in these landscapes, whether tied to biophysical factors (elevation, slope and precipitation), accessibility (distance from roads, settlements, rivers, conflict and shifting cultivation) or land management (forest concessions and protected areas). As it is fed data on how the landscapes have changed in the past, the model 'learns' and adapts until it arrives at the most accurate linkages between past loss and its drivers. The model then ranks the influence of the various forest-loss drivers in the CARPE landscapes and produces a map showing areas at high risk of forest loss.
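
The study's model was built with spatial-modeling software, but the shape of the approach can be sketched in a few lines of Python (synthetic data and illustrative feature names; this is not the study's actual model):

```python
# Sketch: a small neural network mapping per-location driver variables
# to a probability of forest loss (assumes scikit-learn and NumPy;
# the data below is synthetic and the feature list is illustrative).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([
    rng.normal(400, 150, n),   # elevation (m)
    rng.exponential(5, n),     # slope (degrees)
    rng.normal(1800, 300, n),  # precipitation (mm/yr)
    rng.exponential(10, n),    # distance to road (km)
    rng.integers(0, 2, n),     # shifting cultivation nearby (0/1)
])
# Synthetic label: loss is more likely near roads and cultivation.
y = (rng.random(n) < 1 / (1 + np.exp(0.3 * X[:, 3] - 2 * X[:, 4]))).astype(int)

model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
).fit(X, y)

risk = model.predict_proba(X)[:, 1]  # per-location risk scores (a "risk map")
print("mean predicted risk:", round(risk.mean(), 3))
```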

What Major Factors Influenced Forest Loss?

Model results show that human presence has had a significant influence on forest loss in the landscapes. Presence of shifting cultivation — a common land-use practice in the region, as most people rely on subsistence farming to feed their families — had the highest influence on predicting forest loss. Distance from roads was also significant, allowing access into forests that would otherwise be too remote and difficult to traverse for fuelwood and timber extraction. Average precipitation levels had an important influence on forest loss as well, likely due to their effect on agriculture and the probability of forest fires.


Area cleared for agriculture near Yangambi, DRC. Flickr/CIFOR

The majority of loss is expected to occur in the Ituri-Epulu-Aru (34 percent) and Lac-Télé-Lac Tumba (25 percent) landscapes, with loss concentrating near farmland, along roads and near settlements in all landscapes. In some landscapes, protected areas are particularly vulnerable; for example, protected areas within Salonga-Lukenie-Sankuru and Lac Télé-Lac Tumba landscapes are projected to lose 11,700 hectares (28,911 acres) of forest.

How We Can Use Spatial Modeling

The government of the DRC is currently pursuing an ambitious program of land use planning reform as part of its national REDD+ process. The country has also made important commitments in the forest sector to set aside 17 percent of forest area as protected, reduce deforestation and restore 8 million hectares of forest land (19.8 million acres). With information about the drivers of deforestation in hand, authorities can proactively make land-use decisions that shift development pressure away from high-value forests and identify opportunities to restore degraded land to address the food and energy needs of a growing population.

By revealing where future forest loss is most likely to occur, our research can inform land-use decisions and prioritize conservation efforts. The impact of future roads, settlements or other infrastructure changes can be assessed to reduce forest loss, particularly in protected or other high-value areas. And by seeing where pressure on forests is likely to shift due to increased protection in one area, decision-makers can also better understand the implications of conservation efforts such as increased law enforcement, new protected areas or expanding community forest management.

In the DRC, WRI is working with the government, NGOs and civil society partners to ensure that development of new land-use policies and plans integrate high-quality information on forests and biodiversity, including spatial models and scenarios. As part of these efforts, we will continue to build capacity of decision-makers, including provincial authorities and park managers, to understand and implement the findings of this study. Its results should inform a decision-making framework that ensures that land is developed in a way that considers economic, social and environmental impacts. Spatial data and scenario planning must be integrated with land-use decisions — and donors must support the creation of the spatial data infrastructure needed to make this possible.

The Central Africa Regional Program for the Environment (CARPE) is administered by the U.S. Agency for International Development. CARPE is a US Government long-term effort to sustain the ecological integrity of the second largest tropical humid forest ecosystem in the world – The Congo Basin. CARPE has a rich network of implementing partners, including WWF, WCS, WRI, AWF, regional institutions such as COMIFAC, local NGOs, universities and federal agencies including the Department of Interior, the U.S. Department of Agriculture (U.S. Forest Service and the Foreign Agricultural Service), and the National Aeronautics and Space Administration (NASA).


          By: Man Ray        
Picasa's implementation is well-crafted, permitting incremental tagging. Its machine learning and heuristics make it a pleasure to use -- and accurate enough to greatly reduce the time needed to tag people in my database. Adobe needs to do the same. My database is much less useful without this kind of meta information and too large to do without some sort of assistance. Adobe, get your act together!
          Google "Perspective" Machine Learning To Hide Toxic Comments        
none
          Pedro Domingos on Artificial Intelligence        

On this episode, I am so happy to have Pedro Domingos who is a professor at the University of Washington.

He’s a leading researcher in machine learning and recently wrote an amazing book called The Master Algorithm. In this conversation we explore the sources of knowledge, the five major schools of machine learning, why white-collar jobs are easier to replace than blue-collar jobs, machine wars, self-driving cars and so much more.


          What is This Blog?        
Primarily, this is a blog about statistics, machine learning, computing, and other scientific areas where I have some expertise. Many posts will be quite technical, but some will be more widely accessible. I will also occasionally post on photography and other subjects that I have an amateur interest in, and on some societal topics, particularly […]
          Sr. Machine Learning Engineer (life-saving startup, up to $200k+)        

          Google Search Results in 2016 – XIX Semcamp, Cezary Glijer        
The 19th Semcamp took place on December 7. I had the pleasure of presenting on the future of Google search results. I talked about: Google Answer Box, Knowledge Graph, entities, artificial intelligence, Machine Learning, and RankBrain. Enjoy 🙂

          Stuff The Internet Says On Scalability For July 28th, 2017        

Hey, it's HighScalability time:

 

Jackson Pollock painting? Cortical column? Nope, it's a 2 trillion particle cosmological simulation using 4000+ GPUs. (paper, Joachim Stadel, UZH)

If you like this sort of Stuff then please support me on Patreon.

 

  • 1.8x: faster code on iPad MacBook Pro; 1 billion: WhatsApp daily active users; 100 milliamps: heart stopping current; $25m: surprisingly low take from ransomware; 2,700x: improvement in throughput with TCP BBR; 620: Uber locations; $35.5 billion: Facebook's cash hoard; 2 billion: Facebook monthly active users; #1: Apple is the world's most profitable [legal] company; 500,000x: return on destroying an arms depot with a drone; 

  • Quotable Quotes:
    • Alasdair Allan: Jeff Bezos’ statement that “there’s not that much interesting about CubeSats” may well turn out to be the twenty first century’s “nobody needs more than 640kb.”
    • @hardmaru: Decoding the Enigma with RNNs. They trained a LSTM with 3000 hidden units to decode ciphertext with 96%+ accuracy. 
    • @tj_waldorf: Morningstar achieved 97% cost reduction by moving to AWS. #AWSSummit Chicago
    • Ed Sperling: Moore’s Law is alive and well, but it is no longer the only approach. And depending on the market or slice of a market, it may no longer be the best approach.
    • @asymco: With the end of Shuffle and Nano iPods Apple now sells only Unix-enabled products. Amazing how far that Bell Labs invention has come.
    • @peteskomoroch: 2017: RAM is the new Hadoop
    • Carlo Pescio: What if focusing on the problem domain, while still understanding the machine that will execute your code, could improve maintainability and collaterally speed up execution by a factor of over 100x compared to popular hipster code?
    • @stevesi: Something ppl forget: moving products to cloud, margins go down due to costs to operate scale services—costs move from Customer to vendor.
    • @brianalvey: The most popular software for writing fiction isn't Word. It's Excel.
    • @pczarkowski: How to make a monolithic app cloud native: 1) run it in a docker 2) change the url from .com to .io
    • drinkzima: There is a huge general misunderstanding in the profitability of directing hotel bookings vs flight bookings or other types of travel consumables. Rate parity and high commission rates mean that directing hotel rooms is hugely profitable and Expedia (hotels.com, trivago, expedia) and Priceline (booking.com) operate as a duopoly in most markets. They are both marketing machines that turn brand + paid traffic into highly profitable room nights.
    • Animats: This is a classic problem with AI researchers. Somebody gets a good result, and then they start thinking strong human-level AI is right around the corner. AI went through this with search, planning, the General Problem Solver, perceptrons, the first generation of neural networks, and expert systems. Then came the "AI winter", late 1980s to early 2000s, when almost all the AI startups went bust. We're seeing some of it again in the machine learning / deep neural net era.
    • Charity Majors: So no, ops isn't going anywhere. It just doesn't look like it used to. Soon it might even look like a software engineer.
    • @mthenw: As long as I need to pay for idle it’s not “serverless”. Pricing is different because in Lambda you pay for invocation not for the runtime.
    • Kelly Shortridge: The goal is to make the attacker uncertain of your defensive environment and profile. So you really want to mess with their ability to profile where their target is
    • @CompSciFact: 'About 1,000 instructions is a reasonable upper limit for the complexity of problems now envisioned.' -- John von Neumann, 1946
    • hn_throwaway_99: Few barriers to entry, really?? Sorry, but this sounds a bit like an inexperienced developer saying "Hey, I could build most of Facebook's functionality in 2 weeks." Booking.com is THE largest spender of advertising on Google. They have giant teams that A/B test the living shite out of every pixel on their screens, and huge teams of data scientists squeezing out every last bit of optimization on their site. It's a huge barrier to entry. 
    • callahad: It's real [performance improvements]. We've [Firefox] landed enormous performance improvements this year, including migrating most Firefox users to a full multi-process architecture, as well as integrating parts of the Servo parallel browser engine project into Firefox. There are still many improvements yet-to-land, but in most cases we're on track for Firefox 57 in November.
    • Samer Buna: One important threat that GraphQL makes easier is resource exhaustion attacks (AKA Denial of Service attacks). A GraphQL server can be attacked with overly complex queries that will consume all the resources of the server.
    • wheaties: This is stupid. Really. Here we are in a world where the companies that own the assets (you know, the things that cost a lot of money) are worth less than the things that don't own anything. This doesn't seem "right" or "fair" in the sense that Priceline should be a middleman, unable to exercise any or all pricing power because it does not control the assets producing the revenue. I wonder how long this can last?
    • platz: Apparently deep-learning and algae are the same thing.
    • @CompSciFact: "If you don't run experiments before you start designing a new system, your entire system will be an experiment." -- Mike Williams
    • Scott Aaronson: our laws of physics are structured in such a way that even pure information often has “nowhere to hide”: if the bits are there at all in the abstract machinery of the world, then they’re forced to pipe up and have a measurable effect. 
    • The Internet said many more interesting things this week. To read them all please click through to the full article.

  • Cool interview with Margaret Hamilton--NASA's First Software Engineer--on Makers. Programmers, you'll love this. One of the stories she tells is how her daughter was playing around and selected the prelaunch program during flight. That crashed the simulator. So like a good programmer she wanted to prevent this from happening. She tried to get a protection put in because an astronaut could actually do this during flight. Management would certainly allow this, right? She was denied. They said astronauts are trained never to make a mistake so it could never happen. Eventually she won the argument and was able to add code to protect against human error. So little has changed :-)

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...


          Stuff The Internet Says On Scalability For July 21st, 2017        

Hey, it's HighScalability time:

Afraid of AI? Fire ants have sticky pads so they can form rafts, build towers, cross streams, & order takeout. We can CRISPR these guys to fight Skynet. (video, video, paper)

If you like this sort of Stuff then please support me on Patreon.

 

  • 222x: Bitcoin less efficient than a physical system of metal coins and paper/fabric/plastic; #1: Python use amongst Spectrum readers; 3x: time spent in apps that don't make us happy; 1 million: DigitalOcean users; 11.6 million: barrels of oil a day saved via tech and BigData; 200,000: cores on Cray super computer; $200B: games software/hardware revenue by 2021; $3K: for 50 Teraflops AMD Vega Deep Learning Box; 24.4 Gigawatts: China New Solar In First Half Of 2017; 

  • Quotable Quotes:
    • sidlls: I think instead there is a category error being made: that CS is an appropriate degree (on its own) to become a software engineer. It's like suggesting a BS in Physics qualifies somebody to work as an engineer building a satellite.
    • Elon Musk: AI is a fundamental existential risk for human civilization, and I don’t think people fully appreciate that
    • Mike Elgan: Thanks to machine learning, it's now possible to create a million different sensors in software using only one actual sensor -- the camera.
    • Amin Vahdat (Google): The Internet is no longer about just finding a path, any path, between a pair of servers, but actually taking advantage of the rich connectivity to deliver the highest levels of availability, the best performance, the lowest latency. Knowing this, how you would design protocols is now qualitatively shifted away from pairwise decisions to more global views.
    • naasking: You overestimate AI. Incompleteness is everywhere in CS. Overcoming these limitations is not trivial at all.
    • 451 Research: serverless is poised to undergo a round of price cutting this year.
    • Nicholas Bloom: We found massive, massive improvement in performance—a 13% improvement in performance from people working at home
    • @CoolSWEng: "A Java new operation almost guarantees a cache miss. Get rid of them and you'll get C-like performance." - @cliff_click #jcrete
    • DarkNetMarkets: We're literally funding our own investigation. 
    • Tristan Harris: By shaping the menus we pick from, technology hijacks the way we perceive our choices and replaces them with new ones. But the closer we pay attention to the options we’re given, the more we’ll notice when they don’t actually align with our true needs.
    • xvaier: If I have one thing to tell anyone who is looking for business ideas to try out their new programming skills on, I strongly suggest taking the time to learn as much as possible about the people to whom you want to provide a solution, then recruiting one of them to help you build it, lest you become another project that solves a non-issue beautifully.
    • @sebgoa: Folks, there were schedulers before kubernetes. Let's get back down to earth quickly
    • Mark Shead: A finite state machine is a mathematical abstraction used to design algorithms. In simple terms, a state machine will read a series of inputs. When it reads an input it will switch to a different state. Each state specifies which state to switch to for a given input. This sounds complicated but it is really quite simple. [A sketch follows this list.]
    • xantrel: I started a small business that started to grow, I thought I had to migrate to AWS and increase my cost by 5x eventually, but so far Digital Ocean with their hosted products and block storage has handled the load amazingly well.
    • danluu: when I’m asked to look at a cache related performance bug, it’s usually due to the kind of thing we just talked about: conflict misses that prevent us from using our full cache effectively. This isn’t the only way for that to happen – bank conflicts and false dependencies are also common problems
    • Charles Hoskinson: People say ICOs (Initial Coin Offering) are great for Ethereum because, look at the price, but it’s a ticking time-bomb. There’s an over-tokenization of things as companies are issuing tokens when the same tasks can be achieved with existing blockchains. People are blinded by fast and easy money.
    • Charles Schwab: There don't seem to be any classic bubbles near bursting at the moment—at least not among the ones most commonly referenced as potential candidates.
    • Sertac Karaman: We are finding that this new approach to programming robots, which involves thinking about hardware and algorithms jointly, is key to scaling them down.
    • Michael Elling: When do people wake up and say that we’ve moved full circle back to something that looks like the hierarchy of the old PSTN? Just like the circularity of processing, no?
    • Benedict Evans: Content and access to content was a strategic lever for technology. I’m not sure how much this is true anymore.  Music and books don’t matter much to tech anymore, and TV probably won’t matter much either. 
    • SeaChangeViaExascaleOnDown: Currently systems are still based around mostly separately packaged processor elements(CPUs, GPUs, and other) processors but there will be an evolution towards putting all these separate processors on MCMs or Silicon Interposers, with silicon interposers able to have the maximum amount of parallel traces(And added active circuitry) over any other technology.
    • BoiledCabbage: Call me naive, but am I the only one who looks at mining as one of the worst inventions for consuming energy possible?
    • Amin Vahdat (Google):  Putting it differently, a lot of software has been written to assume slow networks. That means if you make the network a lot faster, in many cases the software can’t take advantage of it because the software becomes the bottleneck.
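
As a companion to Mark Shead's description above, here is a minimal state machine in Python. The states and inputs (a coin-operated turnstile) are our own illustration, not his; the point is just that each (state, input) pair maps to a next state.

    # Transition table: (state, input) -> next state.
    TRANSITIONS = {
        ("locked", "coin"): "unlocked",
        ("locked", "push"): "locked",
        ("unlocked", "push"): "locked",
        ("unlocked", "coin"): "unlocked",
    }

    def run(inputs, state="locked"):
        for symbol in inputs:                     # read a series of inputs...
            state = TRANSITIONS[(state, symbol)]  # ...switching state on each
        return state

    print(run(["coin", "push", "push"]))  # -> locked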

  • Dropbox has 1.3 million lines of Go code, 500 million users, 500 petabytes of user data, 200,000 business customers, and a multi-exabyte Go storage system. Go Reliability and Durability at Dropbox. They use it for: RAT: rate limiting and throttling; HAT: memcached replacement; AFS: file system to replace global Zookeeper; Edgestore: distributed database; Bolt: for messaging; DBmanager: for automation and monitoring of Dropbox’s 6,000+ databases; “Jetstream”, “Telescope”, block routing, and many more. The good: Go is productive, easy to write and consume services, good standard library, good debugging tools. The less good: dealing with race conditions.

  • Professor Jordi Puig-Suari talks about the invention of CubeSat on embedded.fm. 195: A BUNCH OF SPUTNIKS. Fascinating story of how thinking different created a new satellite industry. The project wasn't on anyone's technology roadmap, nobody knew they needed it, it just happened. A bunch of really bright students, in a highly constrained environment, didn't have enough resources to do anything interesting, so they couldn't build spacecraft conventionally. Not knowing what you're doing is an advantage in highly innovative environments. The students took more risk and eliminated redundancies. One battery. One radio. Taking a risk that things can go wrong. They looked for the highest performance components they could find; these were commercial off-the-shelf components that, when launched into space, actually worked. The mainline space industry couldn't take these sorts of risks. Industry started paying attention because the higher performing, lower cost components, even with the higher risk, changed the value proposition completely. You can make it up with numbers. You can launch 50 satellites for the cost of one traditional satellite. Sound familiar? Cloud computing is based on this same insight: modern datacenters are built on commodity parts, and low-cost miniaturized parts driven by smartphones have created whole new industries. CubeSats had a standard size, so launch vehicles could standardize too; it didn't matter where the satellites came from, they could be launched. Sound familiar? This is the modularization of satellite launching, the same force that drives all mass commercialization. Now the same ideas are being applied to bigger and bigger spacecraft. It's now a vibrant industry. Learning happens more quickly because they get to fly more. Sound familiar? Agile, iterative software development is the dominant methodology today. 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...


          Stuff The Internet Says On Scalability For July 7th, 2017        

Hey, it's HighScalability time:

 

 

What's real these days? I was at Lascaux II, an exact replica of Lascaux. I was deeply, deeply moved. Was this an authentic experience? A question we'll ask often in VR I think.

If you like this sort of Stuff then please support me on Patreon.
  • $400k: cost of yearly fake news campaign; $50,000: cost to discredit a journalist; 100 Gbps: SSDP DDoS amplification attack; $5.97BN: wild guess on cost of running Facebook on AWS; 2 billion: Facebook users; 80%: Spotify backend services in production run as containers; $60B: AR market by 2021; 10.4%: AMD market share taken from Intel; 5 days: MIT drone flight time; $1 trillion: Apple iOS revenues; 35%-144%: reduction in image sizes; 10 petabytes: Ancestry.com data stored; 1 trillion: photos taken on iPhone each year; $70B: Apple App Store payout to developers; 355: pages in Internet Trends 2017 report; 14: people needed to make 500,000 tons of steel; 25%: reduced server-rendering time with Node 8; 50-70%: of messages Gmail receives are spam; 8,000: bugs found in pacemaker code; 

  • Quotable Quotes:
    • Vladimir Putin: We must take into account the plans and directions of development of the armed forces of other countries… Our responses must be based on intellectual superiority, they will be asymmetric, and less expensive.
    • @swardley: What most fail to realise is that the Chinese corporate corpus has devoured western business thinking and gone beyond it.
    • @discostu105: I am a 10X developer. Everything I do takes ten times as long as I thought.
    • DINKDINK: You grossly underestimate the hashing capacity of the bitcoin network. The hashing capacity, at time of posting, is approximately 5,000,000,000 Gigahashes/second[1]. Spot measurement of the hashing capacity of an EC2 instance is 0.4 Gigahashes/second[2]. You would need 12 BILLION EC2 instances to 51% attack the bitcoin network.[3] Using EC2 to attack the network is impractical and inefficient.
    • danielsamuels && 19eightyfour~ Machiavelli's Guide to PaaS: Keep your friends close, and your competitors hosted.
    • Paul Buchheit: I wrote the first version of Gmail in a day!
    • @herminghaus: If you don’t care about latency, ship a 20ft intermodal container full of 32GB micro-SD cards across the globe. It’s a terabyte per second. [A back-of-envelope check follows this list.]
    • @cstross: Okay, so now the Russian defense industry is advertising war-in-a-can (multimodal freight containerized missiles):
    • Dennett~ you don't need comprehension to achieve competence.
    • @michellebrush~ Schema are APIs. @gwenshap #qconnyc
    • Stacy Mitchell: Amazon sells more clothing, electronics, toys, and books than any other company. Last year, Amazon captured nearly $1 of every $2 Americans spent online. As recently as 2015, most people looking to buy something online started at a search engine. Today, a majority go straight to Amazon.
    • Xcelerate: I have noticed that Azure does have a few powerful features that AWS and GCP lack, most notably InfiniBand (fast interconnects), which I have needed on more than one occasion for HPC tasks. In fact, 4x16 core instances on Azure are currently faster at performing molecular dynamics simulations than 1x"64 core" instance on GCP. But the cost is extremely high, and I still haven't found a good cloud platform for short, high intensity HPC tasks.
    • jjeaff: I took about 5 sites from a $50 a month shared cPanel plan that included a few WordPress blogs and some custom sites and put them on a $3 a month scaleway instance and haven't had a bit of trouble.
    • @discordianfish: GCP's Pub/Sub is really priced by GB? And 10GB/free/month? What's the catch?
    • Amazon: This moves beyond the current paradigm of typing search keywords in a box and navigating a website. Instead, discovery should be like talking with a friend who knows you, knows what you like, works with you at every step, and anticipates your needs. This is a vision where intelligence is everywhere. Every interaction should reflect who you are and what you like, and help you find what other people like you have already discovered. 
    • @CloudifySource: Lambda is always 100% busy - @adrianco #awasummit #telaviv #serverless
    • @codinghorror: Funny how Android sites have internalized this "only multi core scores now matter" narrative with 1/2 the CPU speed of iOS hardware
    • @sheeshee: deleted all home directories because no separation of "dev" & "production". almost ran a billion euro site into the ground with a bad loop.
    • We have quotes the likes of which even God has never seen! Please click through to read all of them.
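
For what it's worth, @herminghaus's container quip survives a napkin check. Every number below (usable container volume, card dimensions, transit time) is our own rough assumption:

    # Back-of-envelope bandwidth of a 20ft container of 32GB microSD cards.
    container_m3 = 33.0                    # usable volume, approx.
    card_m3 = 0.015 * 0.011 * 0.001        # microSD: 15 x 11 x 1 mm
    cards = container_m3 / card_m3         # ~2e8 cards, ignoring packaging
    total_bytes = cards * 32e9             # ~6.4e18 bytes
    transit_s = 45 * 24 * 3600             # ~45 days by sea
    print(total_bytes / transit_s / 1e12)  # ~1.6 TB/s, so the quip holds up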

  • The Not Hotdog app on Silicon Valley may be a bit silly, but the story of how they built the real app is one of the best how-tos on building a machine learning app you'll ever read. How HBO’s Silicon Valley built “Not Hotdog” with mobile TensorFlow, Keras & React Native. The initial app was built in a weekend using Google Cloud Platform’s Vision API and React Native. The final version took months of refinement. Google Cloud’s Vision API was dropped because its accuracy in recognizing hotdogs was only so-so, it was slow because of the network hit, and it cost too much. They ended up using Keras, a deep learning library that provides nicer, easier-to-use abstractions on top of TensorFlow. They settled on SqueezeNet due to its explicit positioning as a solution for embedded deep learning. SqueezeNet used only 1.25 million parameters, which made training much faster and reduced resource usage on the device. What would they change? timanglade: Honestly I think the biggest gains would be to go back to a beefier, pre-trained architecture like Inception, and see if I can quantize it to a size that’s manageable, especially if paired with CoreML on device. You’d get the accuracy that comes from big models, but in a package that runs well on mobile. And this is really cool: The last production trick we used was to leverage CodePush and Apple’s relatively permissive terms of service, to live-inject new versions of our neural networks after submission to the app store. 
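
For flavor, here is a minimal sketch of the transfer-learning pattern the article describes: a small pretrained convnet with a binary head, fine-tuned on labeled photos. This is not the app's actual code, and we substitute MobileNet for SqueezeNet because Keras does not ship SqueezeNet built in.

    # Binary hotdog / not-hotdog classifier skeleton in Keras.
    from keras.applications import MobileNet
    from keras.layers import GlobalAveragePooling2D, Dense
    from keras.models import Model

    base = MobileNet(weights="imagenet", include_top=False,
                     input_shape=(224, 224, 3))
    x = GlobalAveragePooling2D()(base.output)
    out = Dense(1, activation="sigmoid")(x)   # hotdog vs. not hotdog
    model = Model(base.input, out)
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    # model.fit(...) on labeled photos would go here.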

  • And the winner is: all of us. Serverless Hosting Comparison: Lambda: Unicorn: $20,830.83. Heavy: $120.16. Medium: $4.55. Light: $0.00; Azure Functions: Unicorn: $19,993.60. Heavy: $115.40. Moderate: $3.60. Light: $0.00; Cloud Functions: Unicorn: $23,321.20. Heavy: $138.95. Moderate: $9.76. Light: $0.00; OpenWhisk: Unicorn: $21,243.20. Heavy: $120.70. Medium: $3.83. Light: $0.00; Fission.io: depends on the cost of running your managed Kubernetes cloud. 
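
For context on how such bills arise, here is the shape of the Lambda arithmetic using AWS's published 2017 rates ($0.20 per million requests and $0.00001667 per GB-second, after a free tier of 1M requests and 400,000 GB-seconds). The workload figures are invented, not the article's scenarios.

    # Rough monthly Lambda cost for a hypothetical workload.
    REQ_RATE = 0.20 / 1e6    # dollars per request
    GBS_RATE = 0.00001667    # dollars per GB-second
    FREE_REQ, FREE_GBS = 1e6, 400000

    def lambda_cost(requests, avg_ms, mem_gb):
        gb_seconds = requests * (avg_ms / 1000.0) * mem_gb
        return (max(requests - FREE_REQ, 0) * REQ_RATE
                + max(gb_seconds - FREE_GBS, 0) * GBS_RATE)

    print(round(lambda_cost(10e6, 200, 0.5), 2))  # ~11.8 for a "medium" load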

  • Minds are algorithms made physical. Seeds May Use Tiny “Brains” to Decide When to Germinate: The seed has two hormones: abscisic acid (ABA), which sends the signal to stay dormant, and gibberellin (GA), which initiates germination. The push and pull between those two hormones helps the seed determine just the right time to start growing...According to Ghose, some 3,000 to 4,000 cells make up the Arabidopsis seeds...It turned out that the hormones clustered in two sections of cells near the tip of the seed—a region the researchers propose makes up the “brain.” The two clumps of cells produce the hormones which they send as signals between each other. When ABA, produced by one clump, is the dominant hormone in this decision center, the seed stays dormant. But as GA increases, the “brain” begins telling the seed it’s time to sprout...This splitting of the command center helps the seed make more accurate decisions.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...


          Machine Learning Foundations Course        
          The Fireside Chat: PR Metrics with Rebekah Iliff        

On this month's Fireside Chat, we visit with Rebekah Iliff, chief strategy officer at AirPR, about PR metrics, ranch vacations, machine learning, and more.

The post The Fireside Chat: PR Metrics with Rebekah Iliff appeared first on Spin Sucks.

      

          How to Measure PR with PRTech        

What PR work makes sense to automate? Should you be scared of machine learning and data science? Rebekah Iliff says we should embrace the technology that can turn us into our super selves. She explains

The post How to Measure PR with PRTech appeared first on Spin Sucks.

      

           Steady-state and transient operation discrimination by Variational Bayesian Gaussian Mixture Models         
Zhang, Yu and Bingham, Chris and Gallimore, Michael and Chen, Jun (2013) Steady-state and transient operation discrimination by Variational Bayesian Gaussian Mixture Models. In: IEEE International Workshop on Machine Learning for Signal Processing (MLSP 2013), 22-25 September 2013, Southampton, UK.
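
As a rough illustration of the technique in the title (not the authors' code), scikit-learn's variational Bayesian Gaussian mixture can separate low-variance "steady-state" samples from higher-variance "transient" ones in a toy signal:

    # Cluster a toy 1-D signal into steady-state vs. transient components.
    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.default_rng(0)
    steady = rng.normal(0.0, 0.05, 500)     # quiet, steady operation
    transient = rng.normal(1.0, 0.5, 100)   # shifting, noisy operation
    signal = np.concatenate([steady, transient]).reshape(-1, 1)

    gmm = BayesianGaussianMixture(n_components=2).fit(signal)
    labels = gmm.predict(signal)            # component index per sample
    print(labels[:5], labels[-5:])
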
          Latinoware 2014, here we go!        

Introduction

After five years I am returning to Latinoware, the Free Software community event held in Foz do Iguaçu, Paraná, Brazil.

Besides the personal connections, PGP key exchanges, meeting online friends in person, beers and so on, there is an extensive and rich program. So, for my own planning, I list below the talks and workshops I intend to attend. If you are there at these times, we can share the same space-time coordinates :-)

What I plan to attend/watch/show up for

The full program (with a synopsis of each talk/workshop/keynote) can be seen here.

15/10/2014

  • 10h - 11h - GNU/Linux - It is not 1984 (or 1969) anymore - Jon “Maddog” Hall
  • SIMULATING PHENOMENA WITH GEOGEBRA - Marcela Martins Pereira and Eduardo Antônio Soares Júnior
  • 12h - 13h - Collaborative open spaces - Guilherme Guerra
  • 13h - 14h - (grab something to eat) and try to split myself between: Technological illiteracy and teacher training - Antonio Carlos C. Marques - and Internet of Things: Creating APIs for the real world with Raspberry Pi and Python - Pedro Henrique Kopper
  • 14h - 16h - Official opening of Latinoware
  • 16h - 17h - Hands-on video editing with kdenlive - Carlos Cartola
  • 17h - 18h - red#matrix, much more than a social network - Frederico (aracnus) Gonçalves Guimarães

16/10/2014

  • 10h - 11h - Copyright and the pitfalls of using “cloud” and “free” services to build educational objects - Márcio de Araújo Benedito
  • 11h - 12h - Collaboration and Free Tools: possibilities for counter-hegemony at school - Sergio F. Lima
  • 12h - 13h - Free Teacher! The use of free software in teacher education - Wendell Bento Geraldes
  • 13h - 14h - (grab something to eat) and Open documentation standards - ODF - Fa Conti
  • 14h - 15h - Bitcoin, the future of money is open source (and free) - Kemel Zaidan - and Open Hardware Platform for Robotics - Thalis Antunes De Souza and Thomás Antunes de Souza
  • 15h - 16h - Mozilla and Education: how we are revolutionizing the teaching of digital skills - Marcus Saad
  • 16h - 17h - Arduino Uno vs. MSP430 - Raphael Pereira Alkmim and Yuri Adan Gonçalves Cordovil
  • 17h - 18h - Inclusion of people with disabilities in education - with Free Software it is possible - Marcos Silva Vieira

17/10/2014

  • 10h - 12h - Digital presence: being there is not enough, you have to participate - Frederico (aracnus) Gonçalves Guimarães. This one will be hands-on :-)
  • 12h - 13h - I think I will have lunch :-)
  • 13h - 14h - Education and technology with free resources - Marcos Egito
  • 14h - 14:15h - Official event photo
  • 14:15h - 15:15h - Introduction to LaTeX - Ole Peter Smith
  • 15:15h - 16:15h - abnTeX2 and LaTeX: “absurd” standards and elegant documents - Lauro César
  • 16:15h - 17:15h - Data Science / Big Data / Machine Learning and Free Software - Eduardo Maçan

If you are around, get in touch!


          AI, analytics accelerate pace of digital workplace transformation | Saudi Gazette - Business         
"New research which examined how organisations are evolving from a traditional office environment to a digital workplace revealed that gaining competitive advantage and improving business process are among the top goals of their digital transformation strategy" inform Saudi Gazette.



This is according to 40% of 800 organizations in 15 countries on five continents that were interviewed for Dimension Data’s Digital Workplace Report: Transforming Your Business which was published recently.

Another insight in the Report is that digital transformation is not just about adopting the technologies of the past: 62% of research participants expect to have technology such as virtual advisors in their organizations within the next two years. In addition, 58% expect to start actively investing in technology that powers virtual advisors in the next two years.

Photo: Mechelle Buys Du Plessis
“In a new Digital Transformation in the Workplace report published by Dimension Data last month, the research revealed that countries in the GCC and the Middle East are rapidly divesting from oil driven economies into digital, smart, mobile, and data enabled services driven economies. Migrating their workforces into transformative work environments is critical to ensure success in this journey," said Mechelle Buys Du Plessis, Managing Director, Dimension Data, Middle East.

Today, the digital workplace is no longer just made up of managers and those managed; co-workers collaborating with one another to complete projects; and employees interacting with customers and partners. It’s increasingly populated by ‘virtual employees’ who do not exist in a physical sense, but nonetheless play an important role in the organization.

While artificial intelligence (AI) technology is still in its infancy, it is sufficiently advanced to be working its way into companies in the form of virtual assistants, and, in certain industries such as banking, virtual tellers and virtual advisors. Manifested as bots embedded into specific applications, virtual assistants draw on AI engines and machine learning technology to respond to basic queries.
Read more...

Source: Saudi Gazette

          Are Aussie businesses still struggling with digital strategies? | ARNnet        
Photo: Hafizah Osman
"A new study by Pure Storage suggests they might be" reports Hafizah Osman (ARN).

Photo: ARNnet

It is no surprise that digital transformation is rapidly being adopted across businesses in Australia and New Zealand (A/NZ), especially with emerging technologies such as the Internet of Things (IoT), artificial intelligence (AI) and machine learning leading the charge.

But a closer look into the strategies behind digital show business uncertainties, according to findings of a study by Pure Storage.

In the study, Pure Storage revealed that digital transformation is reaching a tipping point across the region, with 57 per cent of businesses across A/NZ, and 46 per cent of businesses across the Asia Pacific and Japan region, now deriving more than half their revenue from digital streams.

The independent survey which polled more than 3,000 organisations in Asia-Pacific and Japan and 500 in A/NZ, also found that 52 per cent of businesses in A/NZ are looking to digital services to drive faster innovation, while 44 per cent said digital services can help them determine new potential business models.

But against this digital gold rush, businesses across A/NZ are still unsure about the optimal IT strategy that underpins their move to digital, and who is responsible for it, it stated.

Specifically, companies in A/NZ are on the fence about cloud – 50 per cent of businesses polled said they plan to move their business critical workloads to public cloud in the next 18 to 24 months, with the rest planning to adopt private cloud solutions.

In addition, it found that 34 per cent of A/NZ enterprises have moved workloads from public cloud back to on-premises, with the biggest concern against public cloud use being security (57 per cent of respondents). 

Pure Storage A/NZ regional vice-president, Mike Sakalas, said local enterprises are learning that if they want to build out a new class of web-scale applications and leverage the latest in predictive analytics, AI, and machine learning, they need the right data strategy and platform.

“Australian and New Zealand businesses have faced a colossal shift in infrastructure in light of digital transformation over the last few years. 
Read more...

Source: ARNnet

          AI, analytics accelerate pace of digital workplace transformation | Digital News Asia        
  • The number one barrier to successful adoption of new workstyles is IT issues
  • Business leaders and CIOs are switched on to the importance of mobility in the digital workplace

Photo: Digital News Asia

"NEW research which examined how organisations are evolving from a traditional office environment to a digital workplace reveals that gaining competitive advantage and improving business process are among the top goals of their digital transformation strategy" says Digital News Asia.

This is according to 40% of 800 organisations in 15 countries on five continents that were interviewed for Dimension Data’s Digital Workplace Report: Transforming Your Business which was published today.

Another insight in the Report is that digital transformation is not just about adopting the technologies of the past: 62% of research participants expect to have technology such as virtual advisors in their organisations within the next two years.

In addition, 58% expect to start actively investing in technology that powers virtual advisors in the next two years.

Today, the digital workplace is no longer just made up of managers and those managed; co-workers collaborating with one another to complete projects; and employees interacting with customers and partners. It’s increasingly populated by ‘virtual employees’ who do not exist in a physical sense, but nonetheless play an important role in the organisation.

While artificial intelligence (AI) technology is still in its infancy, it is sufficiently advanced to be working its way into companies in the form of virtual assistants, and, in certain industries such as banking, virtual tellers and virtual advisors.

Manifested as bots embedded into specific applications, virtual assistants draw on AI engines and machine learning technology to respond to basic queries.

Photo:  Kane Steele
“It’s no longer enough to simply implement these technologies,” says End-user Computing general manager Kane Steele.

“Organisations have grown their use of analytics to understand how these technologies impact their business performance:  64% use analytics to improve their customer services, and 58% use analytics to benchmark their workplace technologies.”

Meanwhile, around 30% of organisations said they’re far along in their digital transformation initiatives and are already reaping the benefits, while others are still in the early stages of developing a plan.
Read more...

Source: Digital News Asia

          Combining Open Commerce Datasets to Drive Better Trade Business Intelligence        

“We can each define ambition and progress for ourselves. The goal is to work toward a world where expectations are not set by the stereotypes that hold us back, but by our personal passion, talents and interests.”—Sheryl Sandberg

As we bring private-sector innovators and technologies into our challenge to use public data to solve public problems, it’s striking how many are finding new ways to break through and apply their passions, talents and interests:

  • We have one company that is making our data available free and open on a platform with 700,000 data scientists.
  • We have another company that is wrangling, integrating, and presenting our data with information from a number of other public sources.
  • A third is making Commerce data more accessible via interactive visualizations and filters.
Counselor Justin Antonipillai, Economics & Statistics Administration (left) and Ike Kavas, Founder, Ephesoft

In today’s announcement, we are sharing what Ephesoft will be doing, free and open for the public, to advance the goals of democratizing our data using their technology.

We here at Commerce make a lot of data available in many formats, including in bulk and through our application programming interfaces (APIs). However, some of the data that we make available to the public might be available in pictures that contain the data—like files in “portable document format” (PDF) or in “tagged image file format” (TIFF). Documents in these formats are nearly impossible to derive insights from because the data itself is unstructured and hard to analyze.

In response, Ephesoft has used its digitization and machine learning technology to start extracting meaningful data from these images. Ephesoft first performed a proof-of-concept exercise on data from the US Patent and Trademark Office, by running patent data in image-based PDF format through their platform and identifying fields such as patent date and number.

Once these fields have been identified, Ephesoft’s algorithms extract pertinent data from the images and identify linkage across multiple documents. In its exercise on US patent data, Ephesoft’s resulting mind map visualization displays how one patent is connected to other patents, based on references, citations, and abstracts. This information can be used to analyze bright spots and clusters in US research, as well as identify gaps in patented technology or ‘lonely’ patents in spaces where little other patented art exists.
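
As a generic illustration of the extraction step in this kind of pipeline (open-source OCR, not Ephesoft's proprietary technology), text can be pulled out of an image-based PDF before any field parsing or linkage happens. The file name is made up, and pdf2image additionally requires the poppler system package.

    # Render an image-based PDF to images, then OCR each page.
    from pdf2image import convert_from_path
    import pytesseract

    pages = convert_from_path("patent_scan.pdf")
    text = "\n".join(pytesseract.image_to_string(page) for page in pages)
    print(text[:200])  # downstream code would parse fields like patent number/date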

Before:

Proof-of-concept exercise on data from the US Patent and Trademark Office.

After:

Ephesoft’s mind map visualization displays how one patent is connected to other patents, based on references, citations, and abstracts.

In addition to its ability to extract data from images, Ephesoft can also house large volumes of public data to create knowledge bases that allow organizations to compare their unstructured data against a series of benchmarks. For free and open use, their team is now working to combine trade data from the US International Trade Administration, the US Census Bureau, and the Bureau of Economic Analysis at Commerce to develop a public knowledge base for American industry.

This tool will be able to help U.S. businesses answer questions such as:

  • Will regulation requirements impact shifts in my export strategy?
  • Are my export practices compliant with all relevant trade regulations?
  • Which markets are most similar to my current trade portfolio?
  • How does my organization compare to other organizations in the same industry?

What continues to be exciting about these collaborations is that they highlight the many talents that readers of this blog might bring to help improve the lives of the American people. It’s not about having a specific talent; it’s about how you can use your unique talent to serve.

Data can be used for just about anything these days—ordering a ride, booking a hotel, or determining which political candidate to cast a vote for. When the Commerce Department announced its challenge for private companies to use their technology for public good, we had no idea such a wide variety of organizations would come forward, offering to leverage their unique capabilities for the good of the American people. But they did come forward, in droves, using their individual talents to build tools for our citizens.

We hope you join us.

Thanks for reading.

Justin and Ike


          â€œOmne Vivum Ex Vivo” for Data-Driven Health        

Louis Pasteur’s law of biogenesis (reflected in the above Latin phrase meaning “all life from life”) might well capture how we see our mission at Commerce to democratize and harness public data to drive innovation and help solve big public problems.

To explain, we see in countless ways every day—from patent search that begets new patents, to trade data used to fuel exports—how mining and combining open public data, and combining public data with private data sets, multiplies its value and impact, just as Pasteur’s microorganisms multiplying in an open flask led to life-saving vaccinations and drugs.

Counselor Justin Antonipillai, Economics & Statistics Administration (left) and Chairman & Co-Founder Marc Dacosta, Enigma

A century after Pasteur’s day, prescription drug development and application is critical to the health of our people, and the strength of our economy. In turn, the volume of US government public data on prescription drugs available today is staggering, and invaluable.

Every year, there are disparate open data sets from public sources including our own Department of Commerce US Patent and Trademark Office (drug patents), National Institutes of Health (clinical trials), the US Food and Drug Administration (“Orange Book” of approved drugs), Adverse Event Reporting, Orphaned products, US Centers for Disease Control recalls, the US Centers for Medicare and Medicaid Services, and the European Drug Registry.

The challenge, of course, is that to derive insights and make this digestible, these data sets need to be brought together and tools need to be provided to really unleash the potential. As you know from my recent posts, we here at Commerce have issued a challenge to companies and the private sector to make our constant surge of data more consumable. The initial response is inspiring. We’re onto something big.

Today, I write about another company that has taken on this challenge. Enigma has brought many of these public data sets together—including the Commerce/PTO patent data—and created “A Prescription for Healthcare Data.” This free and open public site will make available to the public “wrangled” data from disparate government sources on over 80 of the most commonly-dispensed drugs in America, and lets users follow the winding path from patent to patient.

“Knowledge,” Dr. Pasteur said, “is the torch which illuminates the world.” Imagine the brilliant lumens of light that our troves of public data on medicine today—if made more useful and useable—could shine on our children’s world tomorrow.

This is exactly why I have such a passion for our mission to democratize our public data. It’s not really about machine learning, algorithms, predictive analytics and the like, as important as these technologies and the talents driving them are. It’s about putting our data assets to work to make life better for more people.

It is terrific that Enigma has taken on this public challenge to make our data more available to more of the country, and we are proud that our PTO data could serve such a critical role in this free and open project.

Here’s a joint piece by Marc and me about the Enigma project.

What are we taking?

About 60 percent of Americans over 20 take at least one prescription drug, the American Medical Association’s journal reports. Over 15 percent take five or more. Pharma is a nearly $400 billion industry in the US alone. Taking prescription medicine is part of daily life, and plays a major role in advancing health and wellness. Naturally, it’s a major healthcare policy and headline issue.

Yet few have a complete picture of the drugs we’re prescribed and rely on. It’s hard to follow the lifecycle of activities and milestones, like: Who developed the drug? What was it designed to address? How was it tested? Who is prescribing or using it?

These questions linger despite the wealth of data generated and compiled in the conception, development and regulation of the drugs, and made public by pharma companies and government agencies. The problem is, this data is fragmented and disbursed by many disparate US government sources in different ways.

How can we know more?

A Prescription for Healthcare Data, a new free and open healthcare data site created by Enigma, tries to address this problem.

A Prescription for Healthcare Data

It’s a connected and open data experience that links together existing public datasets from several sources—including open data from our Patent and Trademark Office—on 80 commonly-prescribed drugs that make up about 70 percent of all US pharmaceutical sales.

A Prescription for Healthcare Data provides transparency into healthcare data, enabling visitors to trace these prescription drugs’ critical milestones from early development to generic usage. By connecting 10 typically fragmented public datasets, the site provides coherent and extensive timelines for each prescription drug, enabling visitors to easily explore and draw insights from these drug lifecycles.

Common Drugs Over Time

Bringing these public datasets together is just the beginning, and others can continue to build upon the site and datasets or create their own sites.

Who might benefit?

Beneficiaries of this wrangling and presentation of public prescription drug data could include:

  • Healthcare policy experts analyzing drug development. They could find out: Is development becoming costlier, and if so, why? Do particular diseases encourage more or less investment and how are they affected by development timelines? Are patent extensions legitimate representations of innovation? Has the Orphan Drug Act succeeded in accelerating innovation? What disease areas face the largest gaps in investment and drug discovery?
Chart showing adverse events of medicine.
  • Pharmaceutical R&D and commercialization analysts seeking to understand competitor activity and drug discovery events could answer: How are competitors accelerating their drug development timelines? How have patent extension regulations altered the pace of novel drug discovery? Which companies lead in drug discovery? Has time-to-first-generic-entry varied over time or by disease area?
  • Medicare analysts tracking drug pricing trends could determine: Is Medicare spending commensurate with the level of investment in drug research? Does Medicare see meaningful price fluctuations or decreases post-approval? Have generic entrants meaningfully led to reductions in price?

Chance favors the prepared mind, Pasteur said. We hope this data and our efforts to help others harness it will give us all—everywhere—a better chance to live a healthier and happier life.

Thanks for reading.

Justin and Marc


          Innovations in Training at Commerce        

Leonardo DaVinci.  Paul Revere.  Ben Franklin.  Vincent Van Gogh.  Elvis Presley.  David Beede.  April Blair. William Hawk. Andrea Julca.  Karlheinz Skowronek. Patricia Tomczyszyn.

Which names don’t seem to belong with the others?  Trick question.  While DaVinci and other names obviously are famous, the backgrounds of all of these women and men have something major in common: They trained as apprentices.

Cover image of The Benefits and Costs of Apprenticeships: A Business Perspective report.

DaVinci and Van Gogh apprenticed as painters.  Revere, as a silversmith and Franklin, as a printer.  Elvis apprenticed not as a singer, but as an electrician.  As for Beede, Blair, Hawk, Julca, Skowronek and Tomczyszyn, they are all Commerce Department employees who trained on the job in data analysis, discovery and visualization at the Commerce Data Academy.  (More on this in a bit.)

This past November included National Apprenticeship Week, and as US Secretary of Labor Thomas Perez said, “Apprenticeships are experiencing a modern renaissance in America because the earn-while-learn model is a win-win proposition for workers looking to punch their ticket to the middle-class and for employers looking to grow and thrive in our modern global economy.”

The first-time study of the business benefits and costs of apprenticeships by ESA’s Office of Chief Economist and Case Western University underscores the point.  Key findings of the full report from 13 case studies include:

  • Companies turned to apprenticeships most often when they simply could not find skilled workers off the street locally.
  • Filling hard-to-fill jobs was the single most common benefit of apprenticeships.
  • Companies adapt apprenticeships to meet their unique needs.

The report also looked at the relative costs and benefits at two companies.  Siemens finds an 8 percent return on investment on its apprenticeship program relative to hiring skilled workers.  The Dartmouth-Hitchcock Medical Center’s apprenticeships helped to reduce unpopular, unproductive, expensive overtime work from medical providers and to increase booked hours, which together more than paid for the apprentices.

ApprenticeshipUSA Toolkit - Advancing Apprenticeship as a Workforce Strategy

Overall, the companies studied were unanimous in their support of registered apprenticeships.  They found value in the program and identified benefits that more than justified the costs and commitments to the apprentices.

The Labor Department’s ApprenticeshipUSA program offers tools and information to employers and employees about adopting the age-old earn-while-learning model for 21st century workforce needs.

Meanwhile, Commerce is helping to pioneer a hands-on, data-centric instructional model for the federal government by establishing our Commerce Data Academy – winner of the FedScoop 50 Tech Program of the Year for 2016.

Launched by the Commerce Data Service, the Data Academy’s goal is to empower more Commerce employees to make data-driven decisions, advancing the data pillar of Secretary Penny Pritzker’s “Open for Business” strategy and bringing a data-driven approach to modernizing government.

The Academy began as a pilot last January to test demand for training in data science, data engineering and web development.  The pilot offered classes in agile development, HTML & CSS, Storytelling with Data, and Excel at Excel. 

Commerce Data Academy - Educating and empowering Commerce Employees.

We crossed our fingers for at least 30 pilot enrollees.  We were thrilled to receive 422 registrations, and initial classes posted a nearly 90 percent attendance rate. 

This overwhelming response told us: We needed to officially launch the Academy. 

So we did, and to date, we have presented 20 courses (two more are coming) with topics from agile development, to storytelling with data, to machine learning, to programming in Python. We have more than 1,900 unique Commerce registrants and nearly 1,100 unique attendees so far.

For the truly committed (and talented), the Academy offers an in-depth residency program that features a chance to work alongside data scientists and professional developers within the Commerce Data Service.  Residents must tough it out through an immersive boot camp to prepare for a three-month detail, but the reward is a chance to work on a high-priority problem or need identified by their home bureaus.

The first class of 13 residents recently graduated, and their bureaus are thrilled with the data-oriented products they have created – several featured at the recent Opportunity Project Demo Day and the Commerce Data Advisory Council (CDAC) meeting – and they’ve been instrumental in many of the Commerce Data Service’s award-winning innovations.

We really appreciated the additional feedback from CDAC’s digital thought leaders who encouraged us to spread the word about our academy, join the online digital education community, harness the alumni network as it grows, and keep teaching our grads. 

Back Row L-R: Kevin Markham (Instructor, General Assembly); Steven Finkelstein (Census); Dmitri Smith (BIS); William Hawk (ESA); David Garrow (Census); Adam Bray (Instructor, General Assembly); David Beede (ESA). Front Row L-R: Karlheinz Skowronek (PTO), April Blair (PTO), Tanya Shen (BEA), Stephen Devine (EDA), Laura Cutrer (NOAA), and Patricia Tomczyszyn (MBDA). Not pictured: Amanda Reynolds (ITA), Gregory Paige (OS), Andrea Julca (BEA), Jennifer Rimbach (MBDA) Photograph: Dr. Tyrone Grandison.

Data skills are invaluable in the 21st century digital economy, and we want to do our part to advance America’s economy and competitiveness. 

Jake Schwartz, CEO and co-founder of the education startup General Assembly (a collaborator in setting up the initial Commerce Data Academy courses), recently told CNBC that the number-one skill that employers demand now is around data science, development and analytics.  This demand is driving new opportunities for rewarding jobs and careers in data.  But not just for digital and data professionals.  Everyone, in almost every industry and every job, and every level, including CEOs, will need a basic grounding. 

“An investment in knowledge,” onetime apprentice Ben Franklin said, “pays the best interest.” 

Franklin’s employers who invested in his printing apprenticeship certainly never imagined that he would leverage his trade to help found a nation.  Who knows how the Commerce Data Academy will launch David Beede, April Blair, William Hawk, Andrea Julca, Karlheinz Skowronek, Patricia Tomczyszyn and other alumni – current and future – toward a future that could change the world. 

Thanks for reading.

Justin

Justin Antonipillai - Counselor to Secretary Penny Pritzker, with the Delegated Duties of the Under Secretary for Economic Affairs


          PowerArchiver 2017 17.00.91        

PowerArchiver is a professional 64-bit (and 32-bit) compression utility, with support for over 60 formats and exclusive Advanced Codec Pack - .PA format with strongest/fastest compression.

New .PA format has two modes - Optimized Strong and Optimized Fast. It offers best compression on the market due to specialized compressors for pdf/docx/jpeg/exe/text/image/sound formats. Overall .pa format is strongest/fastest format on the market today! Over 15 various codecs and filters work together to lower the size of your files. Exclusive PDF/DOCX/ZIP re-compression - up to 85% lower size. Special data de-duplication filter will significantly compress similar files.

PA is really simple to use: it automatically selects the best mode for each file. Machine learning is used to optimize codecs for the best speed/compression ratio.

Superior multicore, unlimited size ZIP and ZIPX format support compared to other archivers. Fully compatible with WinZip and SecureZip.

Support for PA, ZIP, RAR, ZIPX, 7-ZIP, CAB, PGP, TAR, XZ, GZIP, BZIP2, ISO (ISO9660 and UDF), ZPAQ, WIM, BH, LHA (LZH), XXE, UUE, yENC, MIME (Base 64), ARJ, ARC, ACE, MSI, NSIS, CHM, over 60 total.

PowerArchiver 256bit AES encryption is FIPS 140-2 validated for government use. Supports Volume Shadow Copy (VSS) and UAC elevation, so you can zip any file on your computer, even in-use databases or Outlook PST files. Password Policies allow setup of a minimum password policy/rule, to force users to enter passwords w/proper length and mix of characters. File Wiping wipes your temporary files by using DoD 5220.22-M suggested methods for clearing & sanitizing information on writable media.

PowerArchiver has an advanced GUI with beautiful skins and the ability to choose among many options. It fully supports 4K displays and large DPI. Touchscreen support!

Other features include Encryption with OpenPGP, Backup, Burner, Secure FTP, Convert, Repair, Batch Extract, Batch ZIP, SFX Tool, Compression Profiles, Preview, & much more.


          PowerArchiver 2017 (Portable) 17.00.91        

PowerArchiver is a professional 64-bit (and 32-bit) compression utility, with support for over 60 formats and exclusive Advanced Codec Pack - .PA format with strongest/fastest compression.

New .PA format has two modes - Optimized Strong and Optimized Fast. It offers best compression on the market due to specialized compressors for pdf/docx/jpeg/exe/text/image/sound formats. Overall .pa format is strongest/fastest format on the market today! Over 15 various codecs and filters work together to lower the size of your files. Exclusive PDF/DOCX/ZIP re-compression - up to 85% lower size. Special data de-duplication filter will significantly compress similar files.

PA is really simple to use: it automatically selects the best mode for each file. Machine learning is used to optimize codecs for the best speed/compression ratio.

Superior multicore, unlimited size ZIP and ZIPX format support compared to other archivers. Fully compatible with WinZip and SecureZip.

Support for PA, ZIP, RAR, ZIPX, 7-ZIP, CAB, PGP, TAR, XZ, GZIP, BZIP2, ISO (ISO9660 and UDF), ZPAQ, WIM, BH, LHA (LZH), XXE, UUE, yENC, MIME (Base 64), ARJ, ARC, ACE, MSI, NSIS, CHM, over 60 total.

PowerArchiver 256bit AES encryption is FIPS 140-2 validated for government use. Supports Volume Shadow Copy (VSS) and UAC elevation, so you can zip any file on your computer, even in-use databases or Outlook PST files. Password Policies allow setup of a minimum password policy/rule, to force users to enter passwords w/proper length and mix of characters. File Wiping wipes your temporary files by using DoD 5220.22-M suggested methods for clearing & sanitizing information on writable media.

PowerArchiver has an advanced GUI with beautiful skins and the ability to choose among many options. It fully supports 4K displays and large DPI. Touchscreen support!

Other features include Encryption with OpenPGP, Backup, Burner, Secure FTP, Convert, Repair, Batch Extract, Batch ZIP, SFX Tool, Compression Profiles, Preview, & much more.


          Fast Drawing for Everyone        

Drawing on your phone or computer can be slow and difficult—so we created AutoDraw, a new web-based tool that pairs machine learning with drawings created by talented artists to help you draw.


It works on your phone, computer, or tablet (and it’s free!). So the next time you want to make a birthday card, party invite or just doodle on your phone, it’ll be as easy and fast as everything else on the web.


If you’re interested in learning more about the magic behind AutoDraw, check out “Quick, Draw!”  (one of our A.I. Experiments). AutoDraw’s suggestion tool uses the same technology to guess what you’re trying to draw.

Big thanks to the artists, designers, illustrators and friends of Google who created original drawings for AutoDraw.

HAWRAF, Design Studio
Erin Butner, Designer
Julia Melograna, Illustrator
Pei Liew, Designer
Simone Noronha, Designer
Tori Hinn, Designer
Selman Design, Creative Studio

If you are interested in submitting your own drawings, you can do that here. We hope that AutoDraw, our latest A.I. Experiment, will make drawing more accessible and fun for everyone.


          100 announcements (!) from Google Cloud Next '17        

San Francisco — What a week! Google Cloud Next ‘17 has come to an end, but really, it’s just the beginning. We welcomed 10,000+ attendees including customers, partners, developers, IT leaders, engineers, press, analysts, cloud enthusiasts (and skeptics). Together we engaged in 3 days of keynotes, 200+ sessions, and 4 invitation-only summits. Hard to believe this was our first show as all of Google Cloud with GCP, G Suite, Chrome, Maps and Education. Thank you to all who were here with us in San Francisco this week, and we hope to see you next year.

If you’re a fan of video highlights, we’ve got you covered. Check out our Day 1 keynote (in less than 4 minutes) and Day 2 keynote (in under 5!).

One of the common refrains from customers and partners throughout the conference was “Wow, you’ve been busy. I can’t believe how many announcements you’ve had at Next!” So we decided to count all the announcements from across Google Cloud and in fact we had 100 (!) announcements this week.

For the list lovers amongst you, we’ve compiled a handy-dandy run-down of our announcements from the past few days:


Google Cloud is excited to welcome two new acquisitions to the Google Cloud family this week, Kaggle and AppBridge.

1. Kaggle - Kaggle is one of the world's largest communities of data scientists and machine learning enthusiasts. Kaggle and Google Cloud will continue to support machine learning training and deployment services in addition to offering the community the ability to store and query large datasets.

2. AppBridge - Google Cloud acquired Vancouver-based AppBridge this week, which helps you migrate data from on-prem file servers into G Suite and Google Drive.


Google Cloud brings a suite of new security features to Google Cloud Platform and G Suite designed to help safeguard your company’s assets and prevent disruption to your business: 

3. Identity-Aware Proxy (IAP) for Google Cloud Platform (Beta) - Identity-Aware Proxy lets you provide access to applications based on risk, rather than using a VPN. It provides secure application access from anywhere, restricts access by user, identity and group, deploys with an integrated phishing-resistant Security Key, and is easier to set up than an end-user VPN.

4. Data Loss Prevention (DLP) for Google Cloud Platform (Beta) - Data Loss Prevention API lets you scan data for 40+ sensitive data types, and is used as part of DLP in Gmail and Drive. You can find and redact sensitive data stored in GCP, invigorate old applications with new sensitive data sensing “smarts” and use predefined detectors as well as customize your own.

5. Key Management Service (KMS) for Google Cloud Platform (GA) - Key Management Service allows you to generate, use, rotate, and destroy symmetric encryption keys for use in the cloud.

6. Security Key Enforcement (SKE) for Google Cloud Platform (GA) - Security Key Enforcement allows you to require security keys be used as the 2-Step verification factor for enhanced anti-phishing security whenever a GCP application is accessed.

7. Vault for Google Drive (GA) - Google Vault is the eDiscovery and archiving solution for G Suite. Vault enables admins to easily manage their G Suite data lifecycle and search, preview and export the G Suite data in their domain. Vault for Drive enables full support for Google Drive content, including Team Drive files.

8. Google-designed security chip, Titan - Google uses Titan to establish hardware root of trust, allowing us to securely identify and authenticate legitimate access at the hardware level. Titan includes a hardware random number generator, performs cryptographic operations in isolated memory, and has a dedicated secure processor (on-chip).


New GCP data analytics products and services help organizations solve business problems with data, rather than spending time and resources building, integrating and managing the underlying infrastructure:

9. BigQuery Data Transfer Service (Private Beta) - BigQuery Data Transfer Service makes it easy for users to quickly get value from all their Google-managed advertising datasets. With just a few clicks, marketing analysts can schedule data imports from Google Adwords, DoubleClick Campaign Manager, DoubleClick for Publishers and YouTube Content and Channel Owner reports.

10. Cloud Dataprep (Private Beta) - Cloud Dataprep is a new managed data service, built in collaboration with Trifacta, that makes it faster and easier for BigQuery end-users to visually explore and prepare data for analysis without the need for dedicated data engineer resources.

11. New Commercial Datasets - Businesses often look for datasets (public or commercial) outside their organizational boundaries. Commercial datasets offered include financial market data from Xignite, residential real-estate valuations (historical and projected) from HouseCanary, predictions for when a house will go on sale from Remine, historical weather data from AccuWeather, and news archives from Dow Jones, all immediately ready for use in BigQuery (with more to come as new partners join the program).

12. Python for Google Cloud Dataflow in GA - Cloud Dataflow is a fully managed data processing service supporting both batch and stream execution of pipelines. Until recently, these benefits have been available solely to Java developers. Now there’s a Python SDK for Cloud Dataflow in GA; a minimal sketch follows.
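
The sketch below is our own toy example of the Python SDK (Apache Beam) running locally with the default runner; pointing it at the Cloud Dataflow service would add pipeline options (project, runner, staging bucket) not shown here.

    # Word count over toy data with the Apache Beam Python SDK.
    import apache_beam as beam

    with beam.Pipeline() as p:  # default local runner
        (p
         | beam.Create(["cloud dataflow", "cloud datalab", "cloud dataprep"])
         | beam.FlatMap(lambda line: line.split())   # words
         | beam.Map(lambda word: (word, 1))          # (word, 1) pairs
         | beam.CombinePerKey(sum)                   # counts per word
         | beam.Map(print))                          # e.g. ('cloud', 3)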

13. Stackdriver Monitoring for Cloud Dataflow (Beta) - We’ve integrated Cloud Dataflow with Stackdriver Monitoring so that you can access and analyze Cloud Dataflow job metrics and create alerts for specific Dataflow job conditions.

14. Google Cloud Datalab in GA - This interactive data science workflow tool makes it easy to do iterative model and data analysis in a Jupyter notebook-based environment using standard SQL, Python and shell commands.

15. Cloud Dataproc updates - Our fully managed service for running Apache Spark, Flink and Hadoop pipelines adds new support for restarting failed jobs (including automatic restart as needed) in beta; the ability to create single-node clusters for lightweight sandbox development, also in beta; GPU support; and the cloud labels feature, for more flexibility managing your Dataproc resources, now in GA.

New GCP databases and database features round out a platform on which developers can build great applications across a spectrum of use cases:

16. Cloud SQL for PostgreSQL (Beta) - Cloud SQL for PostgreSQL implements the same design principles currently reflected in Cloud SQL for MySQL, namely, the ability to securely store and connect to your relational data via open standards.

17. Microsoft SQL Server Enterprise (GA) - Available on Google Compute Engine, plus support for Windows Server Failover Clustering (WSFC) and SQL Server AlwaysOn Availability (GA).

18. Cloud SQL for MySQL improvements - Increased performance for demanding workloads via 32-core instances with up to 208GB of RAM, and central management of resources via Identity and Access Management (IAM) controls.

19. Cloud Spanner - Launched a month ago, but still, it would be remiss not to mention it because, hello, it’s Cloud Spanner! The industry’s first horizontally scalable, globally consistent, relational database service.

20. SSD persistent-disk performance improvements - SSD persistent disks now have increased throughput and IOPS performance, which are particularly beneficial for database and analytics workloads. Read these docs for complete details about persistent-disk performance.

21. Federated query on Cloud Bigtable - We’ve extended BigQuery’s reach to query data inside Cloud Bigtable, the NoSQL database service for massive analytic or operational workloads that require low latency and high throughput (particularly common in Financial Services and IoT use cases).

New GCP Cloud Machine Learning services bolster our efforts to make machine learning accessible to organizations of all sizes and sophistication:

22.  Cloud Machine Learning Engine (GA) - Cloud ML Engine, now generally available, is for organizations that want to train and deploy their own models into production in the cloud.

23. Cloud Video Intelligence API (Private Beta) - A first of its kind, Cloud Video Intelligence API lets developers easily search and discover video content by providing information about entities (nouns such as “dog,” “flower,” or “human,” or verbs such as “run,” “swim,” or “fly”) inside video content.

24. Cloud Vision API (GA) - Cloud Vision API reaches GA and offers new capabilities for enterprises and partners to classify a more diverse set of images. The API can now recognize millions of entities from Google’s Knowledge Graph and offers enhanced OCR capabilities that can extract text from scans of text-heavy documents such as legal contracts or research papers or books.
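
As a rough sketch of what the OCR path can look like from Python with the google-cloud-vision client (illustrative only; the client surface has changed across releases, and the input file name is a placeholder):

    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open('contract_scan.png', 'rb') as f:  # hypothetical input file
        image = vision.Image(content=f.read())

    # document_text_detection is the OCR path for dense, text-heavy scans
    response = client.document_text_detection(image=image)
    print(response.full_text_annotation.text)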

25. Machine learning Advanced Solution Lab (ASL) - ASL provides dedicated facilities for our customers to directly collaborate with Google’s machine-learning experts to apply ML to their most pressing challenges.

26. Cloud Jobs API - A powerful aid to job search and discovery, Cloud Jobs API now has new features such as Commute Search, which will return relevant jobs based on desired commute time and preferred mode of transportation.

27. Machine Learning Startup Competition - We announced a Machine Learning Startup Competition in collaboration with venture capital firms Data Collective and Emergence Capital, and with additional support from a16z, Greylock Partners, GV, Kleiner Perkins Caufield & Byers and Sequoia Capital.

New GCP pricing continues our intention to create customer-friendly pricing that’s as smart as our products; and support services that are geared towards meeting our customers where they are:

28. Compute Engine price cuts - Continuing our history of pricing leadership, we’ve cut Google Compute Engine prices by up to 8%.

29. Committed Use Discounts - With Committed Use Discounts, customers can receive a discount of up to 57% off our list price, in exchange for a one or three year purchase commitment paid monthly, with no upfront costs.

30. Free trial extended to 12 months - We’ve extended our free trial from 60 days to 12 months, allowing you to use your $300 credit across all GCP services and APIs, at your own pace and schedule. Plus, we’ve introduced new Always Free products -- non-expiring usage limits that you can use to test and develop applications at no cost. Visit the Google Cloud Platform Free Tier page for details.

31. Engineering Support - Our new Engineering Support offering is a role-based subscription model that allows us to match engineer to engineer, to meet you where your business is, no matter what stage of development you’re in. It has 3 tiers:

  • Development engineering support - ideal for developers or QA engineers who can manage with a response within four to eight business hours, priced at $100/user per month.
  • Production engineering support - provides a one-hour response time for critical issues, priced at $250/user per month.
  • On-call engineering support - pages a Google engineer and delivers a 15-minute response time 24x7 for critical issues, priced at $1,500/user per month.

32. Cloud.google.com/community site - Google Cloud Platform Community is a new site to learn, connect and share with other people like you, who are interested in GCP. You can follow along with tutorials or submit one yourself, find meetups in your area, and learn about community resources for GCP support, open source projects and more.

New GCP developer platforms and tools reinforce our commitment to openness and choice and giving you what you need to move fast and focus on great code.

33. Google App Engine Flex (GA) - We announced a major expansion of our popular App Engine platform to new developer communities, emphasizing openness, developer choice, and application portability.

34. Cloud Functions (Beta) - Google Cloud Functions has launched into public beta. It is a serverless environment for creating event-driven applications and microservices, letting you build and connect cloud services with code.

35. Firebase integration with GCP (GA) - Firebase Storage is now Google Cloud Storage for Firebase, adding support for multiple buckets, linking to existing buckets, and integration with Google Cloud Functions.

36. Cloud Container Builder - Cloud Container Builder is a standalone tool that lets you build your Docker containers on GCP regardless of deployment environment. It’s a fast, reliable, and consistent way to package your software into containers as part of an automated workflow.

37. Community Tutorials (Beta)  - With community tutorials, anyone can now submit or request a technical how-to for Google Cloud Platform.

Secure, global and high-performance, we’ve built our cloud for the long haul. This week we announced a slew of new infrastructure updates. 

38. New data center region: California - This new GCP region delivers lower latency for customers on the West Coast of the U.S. and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

39. New data center region: Montreal - This new GCP region delivers lower latency for customers in Canada and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

40. New data center region: Netherlands - This new GCP region delivers lower latency for customers in Western Europe and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

41. Google Container Engine - Managed Nodes - Google Container Engine (GKE) has added Automated Monitoring and Repair of your GKE nodes, letting you focus on your applications while Google ensures your cluster is available and up-to-date.

42. 64 Core machines + more memory - We have doubled the number of vCPUs you can run in an instance from 32 to 64, and instances now support up to 416GB of memory.

43. Internal Load balancing (GA) - Internal Load Balancing, now GA, lets you run and scale your services behind a private load balancing IP address which is accessible only to your internal instances, not the internet.

44. Cross-Project Networking (Beta) - Cross-Project Networking (XPN), now in beta, is a virtual network that provides a common network across several Google Cloud Platform projects, enabling simple multi-tenant deployments.

In the past year, we’ve launched 300+ features and updates for G Suite and this week we announced our next generation of collaboration and communication tools.

45. Team Drives (GA for G Suite Business, Education and Enterprise customers) - Team Drives help teams simply and securely manage permissions, ownership and file access for an organization within Google Drive.

46. Drive File Stream (EAP) - Drive File Stream is a way to quickly stream files directly from the cloud to your computer. With Drive File Stream, company data can be accessed directly from your laptop, even if you don’t have much space on your hard drive.

47. Google Vault for Drive (GA for G Suite Business, Education and Enterprise customers) - Google Vault for Drive now gives admins the governance controls they need to manage and secure all of their files, including employee Drives and Team Drives. Google Vault for Drive also lets admins set retention policies that automatically keep what’s needed and delete what’s not.

48. Quick Access in Team Drives (GA) - Powered by Google’s machine intelligence, Quick Access helps to surface the right information for employees at the right time within Google Drive. Quick Access now works with Team Drives on iOS and Android devices, and is coming soon to the web.

49. Hangouts Meet (GA to existing customers) - Hangouts Meet is a new video meeting experience built on Hangouts that can run 30-person video conferences without accounts, plugins or downloads. For G Suite Enterprise customers, each call comes with a dedicated dial-in phone number so that team members on the road can join meetings without wifi or data issues.

50. Hangouts Chat (EAP) - Hangouts Chat is an intelligent communication app in Hangouts with dedicated, virtual rooms that connect cross-functional enterprise teams. Hangouts Chat integrates with G Suite apps like Drive and Docs, as well as photos, videos and other third-party enterprise apps.

51. @meet - @meet is an intelligent bot built on top of the Hangouts platform that uses natural language processing and machine learning to automatically schedule meetings for your team with Hangouts Meet and Google Calendar.

52. Gmail Add-ons for G Suite (Developer Preview) - Gmail Add-ons provide a way to surface the functionality of your app or service directly in Gmail. With Add-ons, developers only build their integration once, and it runs natively in Gmail on web, Android and iOS.

53. Edit Opportunities in Google Sheets - With Edit Opportunities in Google Sheets, sales reps can sync a Salesforce Opportunity List View to Sheets to bulk edit data, and changes are synced automatically to Salesforce, no upload required.

54. Jamboard - Our whiteboard in the cloud goes GA in May! Jamboard merges the worlds of physical and digital creativity. It’s real time collaboration on a brilliant scale, whether your team is together in the conference room or spread all over the world.

Building on the momentum from a growing number of businesses using Chrome digital signage and kiosks, we added new management tools and APIs in addition to introducing support for Android Kiosk apps on supported Chrome devices. 

55. Android Kiosk Apps for Chrome - Android Kiosk for Chrome lets users manage and deploy Chrome digital signage and kiosks for both web and Android apps. And with Public Session Kiosks, IT admins can now add a number of Chrome packaged apps alongside hosted apps.

56. Chrome Kiosk Management Free trial - This free trial gives customers an easy way to test out Chrome for signage and kiosk deployments.

57. Chrome Device Management (CDM) APIs for Kiosks - These APIs offer programmatic access to various Kiosk policies. IT admins can schedule a device reboot through the new APIs and integrate that functionality directly in a third-party console.

58. Chrome Stability API - This new API allows Kiosk app developers to improve the reliability of the application and the system.

Attendees at Google Cloud Next ‘17 heard stories from many of our valued customers:

59. Colgate - Colgate-Palmolive partnered with Google Cloud and SAP to bring thousands of employees together through G Suite collaboration and productivity tools. The company deployed G Suite to 28,000 employees in less than six months.

60. Disney Consumer Products & Interactive (DCPI) - DCPI is on target to migrate out of its legacy infrastructure this year, and is leveraging machine learning to power next generation guest experiences.

61. eBay - eBay uses Google Cloud technologies including Google Container Engine, Machine Learning and AI for its ShopBot, a personal shopping bot on Facebook Messenger.

62. HSBC - HSBC is one of the world's largest financial and banking institutions and making a large investment in transforming its global IT. The company is working closely with Google to deploy Cloud DataFlow, BigQuery and other data services to power critical proof of concept projects.

63. LUSH - LUSH migrated its global e-commerce site from AWS to GCP in less than six weeks, significantly improving the reliability and stability of its site. LUSH benefits from GCP’s ability to scale as transaction volume surges, which is critical for a retail business. In addition, Google's commitment to renewable energy sources aligns with LUSH's ethical principles.

64. Oden Technologies - Oden was part of Google Cloud’s startup program, and switched its entire platform to GCP from AWS. GCP offers Oden the ability to reliably scale while keeping costs low, to perform under heavy loads, and to consistently deliver sophisticated features including machine learning and data analytics.

65. Planet - Planet migrated to GCP in February, looking to accelerate their workloads and leverage Google Cloud for several key advantages: price stability and predictability, custom instances, first-class Kubernetes support, and Machine Learning technology. Planet also announced the beta release of their Explorer platform.

66. Schlumberger - Schlumberger is making a critical investment in the cloud, turning to GCP to enable high-performance computing, remote visualization and development velocity. GCP is helping Schlumberger deliver innovative products and services to its customers by using HPC to scale data processing, workflow and advanced algorithms.

67. The Home Depot - The Home Depot collaborated with GCP’s Customer Reliability Engineering team to migrate HomeDepot.com to the cloud in time for Black Friday and Cyber Monday. Moving to GCP has allowed the company to better manage huge traffic spikes at peak shopping times throughout the year.

68. Verizon - Verizon is deploying G Suite to more than 150,000 of its employees, allowing for collaboration and flexibility in the workplace while maintaining security and compliance standards. Verizon and Google Cloud have been working together for more than a year to bring simple and secure productivity solutions to Verizon’s workforce.

We brought together Google Cloud partners from our growing ecosystem across G Suite, GCP, Maps, Devices and Education. Our partnering philosophy is driven by a set of principles that emphasize openness, innovation, fairness, transparency and shared success in the cloud market. Here are some of our partners who were out in force at the show:

69. Accenture - Accenture announced that it has designed a mobility solution for Rentokil, a global pest control company, built in collaboration with Google as part of the partnership announced at Horizon in September.

70. Alooma - Alooma announced the integration of the Alooma service with Google Cloud SQL and BigQuery.

71. Authorized Training Partner Program - To help companies scale their training offerings more quickly, and to enable Google to add other training partners to the ecosystem, we are introducing a new track within our partner program to support their unique offerings and needs.

72. Check Point - Check Point® Software Technologies announced Check Point vSEC for Google Cloud Platform, delivering advanced security integrated with GCP as well as their joining of the Google Cloud Technology Partner Program.

73. CloudEndure - We’re collaborating with CloudEndure to offer a no cost, self-service migration tool for Google Cloud Platform (GCP) customers.

74. Coursera - Coursera announced that it is collaborating with Google Cloud Platform to provide an extensive range of Google Cloud training courses. To celebrate this announcement, Coursera is offering all NEXT attendees a 100% discount for the GCP fundamentals class.

75. DocuSign - DocuSign announced deeper integrations with Google Docs.

76. Egnyte - Egnyte announced an enhanced integration with Google Docs that will allow our joint customers to create, edit, and store Google Docs, Sheets and Slides files right from within Egnyte Connect.

77. Google Cloud Global Partner Awards - We recognized 12 Google Cloud partners that demonstrated strong customer success and solution innovation over the past year: Accenture, Pivotal, LumApps, Slack, Looker, Palo Alto Networks, Virtru, SoftBank, DoIT, Snowdrop Solutions, CDW Corporation, and SYNNEX Corporation.

78. iCharts - iCharts announced additional support for several GCP databases, free pivot tables for current Google BigQuery users, and a new product dubbed “iCharts for SaaS.”

79. Intel - In addition to the progress with Skylake, Intel and Google Cloud launched several technology initiatives and market education efforts covering IoT, Kubernetes and TensorFlow, including optimizations, a developer program and tool kits.

80. Intuit - Intuit announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

81. Liftigniter - Liftigniter is a member of Google Cloud’s startup program and focused on machine learning personalization using predictive analytics to improve CTR on web and in-app.

82. Looker - Looker launched a suite of Looker Blocks, compatible with Google BigQuery Data Transfer Service, designed to give marketers the tools to enhance analysis of their critical data.

83. Low interest loans for partners - To help Premier Partners grow their teams, Google announced that capital investments are available to qualified partners in the form of low interest loans.

84. MicroStrategy - MicroStrategy announced an integration with Google Cloud SQL for PostgreSQL and Google Cloud SQL for MySQL.

85. New incentives to accelerate partner growth - We are increasing our investments in multiple existing and new incentive programs, including low interest loans to help Premier Partners grow their teams, increased co-funding to accelerate deals, and expanded rebate programs.

86. Orbitera Test Drives for GCP Partners - Test Drives allow customers to try partners’ software and generate high quality leads that can be passed directly to the partners’ sales teams. Google is offering Premier Cloud Partners one year of free Test Drives on Orbitera.

87. Partner specializations - Partners demonstrating strong customer success and technical proficiency in certain solution areas will now qualify to apply for a specialization. We’re launching specializations in application development, data analytics, machine learning and infrastructure.

88. Pivotal - GCP announced Pivotal as our first CRE technology partner. CRE technology partners will work hand-in-hand with Google to thoroughly review their solutions and implement changes to address identified risks to reliability.

89. ProsperWorks - ProsperWorks announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

90. Qwiklabs - This recent acquisition will provide Authorized Training Partners the ability to offer hands-on labs and comprehensive courses developed by Google experts to our customers.

91. Rackspace - Rackspace announced a strategic relationship with Google Cloud to become its first managed services support partner for GCP, with plans to collaborate on a new managed services offering for GCP customers set to launch later this year.

92. Rocket.Chat - Rocket.Chat, a member of Google Cloud’s startup program, is adding a number of new product integrations with GCP, including Autotranslate via the Translate API, integration with the Vision API to screen for inappropriate content, integration with the NLP API to perform sentiment analysis on public channels, integration with G Suite for authentication, and a full move of back-end storage to Google Cloud Storage.

93. Salesforce - Salesforce announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

94. SAP - This strategic partnership includes certification of SAP HANA on GCP, new G Suite integrations and future collaboration on building machine learning features into intelligent applications like conversational apps that guide users through complex workflows and transactions.

95. Smyte - Smyte participated in the Google Cloud startup program and protects millions of actions a day on websites and mobile applications. Smyte recently moved from self-hosted Kubernetes to Google Container Engine (GKE).

96. Veritas - Veritas expanded its partnership with Google Cloud to provide joint customers with 360 Data Management capabilities. The partnership will help reduce data storage costs, increase compliance and eDiscovery readiness and accelerate the customer’s journey to Google Cloud Platform.

97. VMware Airwatch - Airwatch provides enterprise mobility management solutions for Android and continues to drive the Google Device ecosystem to enterprise customers.

98. Windows Partner Program - We’re working with top systems integrators in the Windows community to help GCP customers take full advantage of Windows and .NET apps and services on our platform.

99. Xplenty - Xplenty announced the addition of two new services from Google Cloud into their available integrations: Google Cloud Spanner and Google Cloud SQL for PostgreSQL.

100. Zoomdata - Zoomdata announced support for Google’s Cloud Spanner and PostgreSQL on GCP, as well as enhancements to the existing Zoomdata Smart Connector for Google BigQuery. With these new capabilities Zoomdata offers deeply integrated and optimized support for Google Cloud Platform’s Cloud Spanner, PostgreSQL, Google BigQuery, and Cloud DataProc services.

We’re thrilled to have so many new products and partners that can help all of our customers grow. And as our final announcement for Google Cloud Next ’17 — please save the date for Next 2018: June 4–6 in San Francisco.

I guess that makes it 101. :-)



          A new generation of Chromebooks, designed for millions of students and educators        

Editor’s Note: At Bett, one of the largest education technology conferences in the world, we're announcing new Chromebooks designed for education. Check out @GoogleForEdu and #BETT2017 to follow along.

When I was a student, I juggled different tools throughout my day—a paper notebook for history, a shared desktop for writing, and a graphing calculator for math. In the years since, computers have begun to replace the need for those various tools—what we did on that calculator can now be done with an app, for example—allowing new possibilities for teaching and learning. Through our tools and devices, we try to help these possibilities come to life. Today both Chromebooks and Classroom are used by more than 20 million teachers and students, and we’re excited to announce that more than 70 million people actively use G Suite for Education.

Chromebooks have been the device of choice for educators because of their simplicity, security, shareability and low cost. And at Bett this week we're introducing a new generation of Chromebooks designed to adapt to the many ways students learn. Look out for new Chromebooks from Acer, Asus, HP, Dell, and Lenovo in addition to the recently announced Samsung Chromebooks—a powerful option for educators. With new apps, stylus and touch capabilities, we expect our partners will continue to build an even wider variety of Chromebooks in the future, including detachables and tablets.

More versatile Chromebooks

At Bett we’re featuring two devices: the Acer Chromebook Spin 11 and the Asus Chromebook C213, arriving late spring. We worked with educators and partners to design these Chromebooks for the specific needs of schools:

  • Stylus capability: Both Chromebooks come with an intelligent, affordable stylus for student note-taking and drawing. The low-cost pens resemble #2 pencils with a unique eraser for correcting mistakes and don’t need charging or pairing, so they can be shared and easily replaced if lost. These Chromebooks use an input prediction model built using Google's machine learning to ensure writing is extremely responsive. And with Optical Character Recognition in apps like Google Keep you can easily search handwritten notes.

“Our math department was keen to get tablets so students could write out equations. Stylus on Chromebooks will be a massive help for mathematics.” - Roger Nixon, ICT Director, Wheatley Park School, Oxford

  • World-facing camera: Schools everywhere have asked for world-facing cameras so students can use Chromebooks to capture photo and video from all directions. We carefully designed the camera on the keyboard side, so when a Chromebook is flipped, the camera faces outwards and students can hold it like a tablet.
  • USB-C charging: We heard from educators that multiple chargers and slow charging wastes precious time for students. Going forward, all Chromebooks will have standard super-fast USB-C charging, so one Chromebook cart can charge any device quickly.

A world of content on Chromebooks

Now educators have even more ways to find great educational content on Chromebooks:

“From teaching design concepts to visual storytelling, Adobe apps on Chromebooks will open up avenues for our students.” - Kelly Kermode, Teacher, Forest Hills Public Schools, Michigan
  • Creative apps: Today we‘re also announcing that creative apps on Chromebooks—WeVideo, Soundtrap, and Explain Everything—are available in the U.K. and Nordics at a discount from resellers XMA, Lin Education and Avalon Solutions when purchased as a bundle.

Recent updates to Google Classroom

On all Chromebooks, students and educators can use Google Classroom to collaborate, stay organized and save time. The Classroom Android app, now available on Chromebooks, opens up new possibilities for students in how they use their devices. With the help of a stylus-enabled Chromebook, students can complete their math homework by hand or sketch a visual for a science project by annotating documents directly in the Classroom app.

Students, teachers and administrators can also use their Chromebooks to try out the new Classroom features we rolled out earlier this month. Now, teachers can assign work to a subset of students, rather than just the entire class, and use new types of Classroom notifications to manage assignments. For administrators, we now offer more insight into how Classroom is used, with Classroom metrics in Admin Console reports.

We believe in the power of technology to help students learn how they learn best and teachers teach the way they find most effective. We’ll continue to work with educators in 2017 to build tools that support the important work they do every day.


          Data-driven crime prediction fails to erase human bias        

Poor, minority communities flagged as drug crime trouble spots in case study

BIG DATA DOESN’T PAY  Software programs that use police records to predict crime hot spots may result in police unfairly targeting low-income and minority communities, a new study shows.

Big data is everywhere these days and police departments are no exception. As law enforcement agencies are tasked with doing more with less, many are using predictive policing tools. These tools feed various data into algorithms to flag people likely to be involved with future crimes or to predict where crimes will occur.

In the years since Time magazine named predictive policing as one of 2011’s best 50 inventions of the year, its popularity has grown. Twenty U.S. cities, including Chicago, Atlanta, Los Angeles and Seattle, are using a predictive policing system, and several more are considering it. But with the uptick in use has come a growing chorus of caution. Community activists, civil rights groups and even some skeptical police chiefs have raised concerns that predictive data approaches may unfairly target some groups of people more than others.

New research by statistician Kristian Lum provides a telling case study. Lum, who leads the policing project at the San Francisco-based Human Rights Data Analysis Group, looked at how the crime-mapping program PredPol would perform if put to use in Oakland, Calif. PredPol, which purports to “eliminate profiling concerns,” takes data on crime type, location and time and feeds it into a machine-learning algorithm. The algorithm, originally based on predicting seismic activity after an earthquake, trains itself with the police crime data and then predicts where future crimes will occur.
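
Such earthquake-style models are self-exciting point processes: each past event temporarily raises the predicted rate nearby in space and time. A toy version of that intensity function, with made-up parameters (an illustration, not PredPol’s actual code), looks like this:

    import numpy as np

    def intensity(t, x, y, events, mu=0.1, theta=0.5, omega=1.0, sigma=0.5):
        """Toy self-exciting rate: a constant background mu plus a kernel
        that decays in time and space around each past event (ti, xi, yi)."""
        lam = mu
        for ti, xi, yi in events:
            if ti < t:
                lam += (theta * omega * np.exp(-omega * (t - ti))
                        * np.exp(-((x - xi) ** 2 + (y - yi) ** 2) / (2 * sigma ** 2))
                        / (2 * np.pi * sigma ** 2))
        return lam

    # predicted "hot spots" are simply where the intensity is currently largest
    events = [(0.0, 1.0, 1.0), (0.5, 1.1, 0.9)]
    print(intensity(1.0, 1.0, 1.0, events))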

Lum was interested in bias in the crime data — not political or racial bias, just the ordinary statistical kind. While this bias knows no color or socioeconomic class, Lum and her HRDAG colleague William Isaac demonstrate that it can lead to policing that unfairly targets minorities and those living in poorer neighborhoods.

By applying the algorithm to 2010 data on drug crime reports for Oakland, the researchers generated a predicted rate of drug crime on a map of the city for every day of 2011. The researchers then compared the data used by the algorithm — drug use documented by the police — with a record of overall drug use, whether recorded or not. This ground-truthing came from taking public health data from the 2011 National Survey on Drug Use and Health and demographic data from the city of Oakland to derive an estimate of drug use for all city residents.

Wheredunit

Drug use in Oakland is probably fairly widespread (left) based on estimates derived in part from the 2011 National Survey on Drug Use and Health. But police records of drug reports and crimes are concentrated in areas that are largely nonwhite and low-income (right).

In this public health-based map, drug use is widely distributed across the city. In the predicted drug crime map, it is not. Instead, drug use deemed worthy of police attention is concentrated in neighborhoods in West Oakland and along International Boulevard, two predominately low-income and nonwhite areas.

Predictive policing approaches are often touted as eliminating concerns about police profiling. But rather than correcting bias, the predictive model exacerbated it, Lum said during a panel on data and crime at the American Association for the Advancement of Science annual meeting in Boston in February. While estimates of drug use are pretty even across race, the algorithm would direct Oakland police to locations that would target black people at roughly twice the rate of whites. A similar disparity emerges when analyzing by income group: Poorer neighborhoods get targeted.

Shifting target

While drug use estimated from public health data is roughly equivalent across racial classifications (top), police using a predictive policing algorithm in Oakland, Calif., would target black people at roughly twice the rate of whites (bottom).

And a troubling feedback loop emerges when police are sent to targeted locations. If police find slightly more crime in an area because that’s where they’re concentrating patrols, these crimes become part of the dataset that directs where further patrolling should occur. Bias becomes amplified, hot spots hotter.
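
The loop is easy to reproduce in a toy simulation (an illustration only, not HRDAG’s model): give two neighborhoods identical true crime rates, allocate patrols in proportion to recorded crime, and record crimes only where police are present.

    import numpy as np

    rng = np.random.default_rng(1)
    true_rate = np.array([10.0, 10.0])  # identical underlying crime rates
    recorded = np.array([11.0, 9.0])    # a small initial discrepancy
    for day in range(365):
        patrol_share = recorded / recorded.sum()
        # crimes are only observed where patrols are concentrated
        observed = rng.poisson(true_rate * patrol_share)
        recorded = recorded + observed
    # the initial imbalance persists rather than washing out
    print(recorded / recorded.sum())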

There’s nothing wrong with PredPol’s algorithm, Lum notes. Machine learning algorithms learn patterns and structure in data. “The algorithm did exactly what we asked; it learned patterns in the data,” she says. The danger is in thinking that predictive policing will tell you about patterns in the occurrence of crime. It’s really telling you about patterns in police records.

Police aren’t tasked with collecting random samples, nor should they be, says Lum. And that’s all the more reason why departments should be transparent and vigilant about how they use their data. In some ways, PredPol-guided policing isn’t so different from old-fashioned pins on a map.

For her part, Lum would prefer that police stick to these timeworn approaches. With pins on a map, the what, why and where of the data are very clear. The black box of an algorithm, on the other hand, lends undue legitimacy to the police targeting certain locations while simultaneously removing accountability. “There’s a move toward thinking machine learning is our savior,” says Lum. “You hear people say, ‘A computer can’t be racist.’”

The use of predictive policing may be costly, both literally and figuratively. The software programs can run from $20,000 to up to $100,000 per year for larger cities. It’s harder to put numbers on the human cost of over-policing, but the toll is real. Increased police scrutiny can lead to poor mental health outcomes for residents and undermine relationships between police and the communities they serve. Big data doesn’t help when it’s bad data.


          IDG Contributor Network: How AI is transforming healthcare for the benefit of patients        

The development of artificial intelligence has made staggering leaps forward in recent years, with products like Apple’s Siri and Amazon’s Alexa now dotting living rooms and businesses across the nation. While AI has wasted little time in shaking up the foundations of most established industries, the field of healthcare, in particular, stands to be fundamentally transformed by this burgeoning technology.

So what exactly does the future of AI hold for the healthcare industry? How are doctors and industry insiders preparing themselves, and what might it mean for patients' futures?

Harnessing the power of machines

More and more prudent investors are realizing that emerging data analytics capabilities are only the start of a forthcoming revolution. As health records are increasingly digitized, doctors and nurses will find themselves capable of ordering AI programs to sift through huge swaths of data to find meaningful trends that lie below the surface. Machine learning and artificial intelligence can archive and retrieve even the most complex sets of data, often doing so with greater efficiency than humans.



          Privacy Metrics        
Along with a colleague - Dr. Yoan Miche - we presented a paper outlining ideas on using mutual information as a metric for establishing some form of 'legal compliance' for data sets. The work is far from complete and the mathematics is getting horrendous!

The paper entitled "On the Development of A Metric for Quality of Information Content over Anonymised Data-Sets" was presented at the excellent Quatic 2016 conference held in Lisbon, Sept 6-9, 2016.

We were also extremely fortunate in that a presenter in our session didn't turn up, and we were graciously given the full hour not just to present the paper but to give a much fuller background, details of future work, and the state of our current results.

Here are the slides:

On The Development of A Metric for Quality of Information Content over Anonymised Data Sets from Ian Oliver

Abstract:

We propose a framework for measuring the impact of data anonymisation and obfuscation in information theoretic and data mining terms. Privacy functions often hamper machine learning by obscuring the classification functions. We propose to use Mutual Information over non-Euclidean spaces as a means of measuring the distortion induced by a privacy function and, following the same principle, we also propose to use Machine Learning techniques in order to quantify the impact of said obfuscation in terms of further data mining goals.
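
As a minimal illustration of the kind of measurement proposed here (a toy sketch, not the framework from the paper), one can compare the mutual information a column carries about a target label before and after an anonymising transformation:

    import numpy as np
    from sklearn.metrics import mutual_info_score

    rng = np.random.default_rng(0)
    age = rng.integers(18, 90, size=5000)
    outcome = (age > 60) ^ (rng.random(5000) < 0.1)  # label with 10% noise

    age_banded = age // 20 * 20  # a simple generalisation "privacy function"

    print(mutual_info_score(outcome, age))         # information before obfuscation
    print(mutual_info_score(outcome, age_banded))  # how much survives anonymisation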

Citation:

Ian Oliver and Yoan Miche (2016). On the Development of A Metric for Quality of Information Content over Anonymised Data-Sets. Quatic 2016, Lisbon, Portugal, Sept 6-9, 2016.
          Will Machines Take Over the World?        
The science questions that you've been sending in get scrutinised and analysed by biologist Sarah Harrison, statistician Simon White, mental health expert Olivia Remes and machine learning guru Peter Clarke. Find out why smaller dogs live longer than bigger breeds, why some people are more susceptible to hayfever, whether machines are destined to take control of the world, and what science says will make you happy...
          Thirteen machine learning models (part I of IV)
In recent years, machine learning has moved to the forefront of emerging-technology news. This trend has grown enormously with the explosion of infrastructures and methodologies that make it easier to carry out machine-learning-oriented development.
          Fitness Meets the Future: Unique New iOS App “ZANUM” is Next Best Thing to a Real-Life Trainer        

ZANUM was developed in partnership with professional real-life personal trainers, who expertly crafted more than 500 detailed exercises from over 100,000 hours of coaching experience. In total, there are 11 unique coaches in the “ZANUM League” to guide the way, and the app uses machine learning to adjust to each individual user’s development on a session-by-session basis.

(PRWeb January 21, 2017)

Read the full story at http://www.prweb.com/releases/appshout/zanum/prweb14000264.htm


          Flipkart intends to launch new fintech business categories, hires Silicon Valley execs        
Flipkart, which already has plans to enter categories like furniture and groceries, is now looking to tap the fintech sector as well. Currently India’s largest online marketplace, the e-commerce company is looking to hire executives from Silicon Valley to assist it in entering the areas of AI (artificial intelligence) and ML (machine learning) while moving away […]
          Data & Society Databite #101: Machine Learning: What’s Fair and How Do We Decide?, by Suchana Seth        

The question is what are we doing in the industry, or what is the machine learning research community doing, to combat instances of algorithmic bias? So I think there is a certain amount of good news, and it's the good news that I wanted to focus on in my talk today.



          Artificial Intelligence: Challenges of Extended Intelligence, by Joi Ito        

Machine learning systems that we have today have become so powerful and are being introduced into everything from self-driving cars, to predictive policing, to assisting judges, to producing your news feed on Facebook on what you ought to see. And they have a lot of societal impacts. But they're very difficult to audit.



          Blockchain, the new guardian angel of insurance companies
Secuobs.com : 2016-04-12 16:37:09 - Global Security Mag Online - Born with the advent of bitcoin and crypto-currencies, the blockchain - a technology for recording transactions in a secure ledger - can find applications in every sector of activity. It should contribute fully to the revolution the insurance market is undergoing today and clearly represents a fine opportunity to reinvent our industry. Big data, dematerialisation and machine learning: digitalisation has created an earthquake in many areas of - Points de Vue
          Building “the switch” using machine learning        
If you have been around algorithmic trading for a while you have probably heard some version of the “switch” concept. This is one of the holy grails of systematic trading, describing an ability to be able to change the way one acts in the market according to market conditions. Today I want to talk about […]
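
One simple way to sketch such a switch (my illustration here, not necessarily the approach discussed in the full post) is to classify the current regime from recent realized volatility and route to a different placeholder strategy per regime:

    import numpy as np

    def regime(returns, window=20, calm_thresh=0.01):
        # crude regime detector: recent realized volatility vs. a threshold
        return 'calm' if np.std(returns[-window:]) < calm_thresh else 'turbulent'

    def trend_following(returns):   # placeholder strategy
        return np.sign(np.mean(returns[-20:]))

    def mean_reversion(returns):    # placeholder strategy
        return -np.sign(returns[-1])

    def switched_signal(returns):
        strat = trend_following if regime(returns) == 'calm' else mean_reversion
        return strat(returns)

    rng = np.random.default_rng(0)
    print(switched_signal(rng.normal(0.0, 0.008, size=100)))
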
          Issues we have faced with our Machine Learning system repository        
Our methodology for finding supervised machine learning strategies in an automated manner has traveled a rough road from the start. Our first attempt – which started in 2015 – faced significant software-based problems that forced us to eliminate all the systems we had mined and start over again, and our current effort hasn’t had an […]
           Solving ill-posed inverse problems using iterative deep neural networks / Jobs: 2 Postdocs @ KTH, Sweden - implementation -        
Ozan just sent me the following e-mail. It has the right mix of elements of The Great Convergence: applying learning-to-learn methods to inverse problems that are among the problems we thought compressive sensing could solve well (CT tomography), papers supporting those results, an implementation, a blog entry and two postdoc jobs. Awesome !
Dear Igor,


I have for some time followed your excellent blog Nuit Blanche. I'm not familiar with how you select entries for Nuit Blanche, but let me take the opportunity to provide potential input for Nuit Blanche on the exciting research we pursue at the Department of Mathematics, KTH Royal Institute of Technology. If you find any of this interesting, please feel free to post it on Nuit Blanche.


1. Deep learning and tomographic image reconstruction
The main objective for the research is to develop theory and algorithms for 3D tomographic reconstruction. An important recent development has been to use techniques from deep learning to solve inverse problems. We have developed a rather generic, yet adaptable, framework that combines elements of variational regularization with machine learning for solving large scale inverse problems. More precisely, the idea is to learn a reconstruction scheme by making use of the forward operator, noise model and other a priori information. This goes beyond learning a denoiser where one first performs an initial (non machine-learning) reconstruction and then uses machine learning on the resulting image-to-image (denoising) problem. Several groups have done learning a denoiser and the results are in fact quite remarkable, outperforming previous state of the art methods. Our approach however combines reconstruction and denoising steps which further improves the results. The following two arXiv-reports http://arxiv.org/abs/1707.06474 and http://arxiv.org/abs/1704.04058 provide more details, there is also a blog-post at http://adler-j.github.io/2017/07/21/Learning-to-reconstruct.html by one of our PhD students that explains this idea of "learning to reconstruct".


2. Post doctoral fellowships
I'm looking to fill two 2-year post-doctoral fellowships, one dealing with regularization of spatiotemporal and/or multichannel images and the other with methods for combining elements of variational regularization with deep learning for solving inverse problems. The announcements are given below. I would be glad if you could post these also on your blog.


Postdoctoral fellow in PET/SPECT Image Reconstruction (S-2017-1166)
Deadline: December 1, 2017
Brief description:
The position includes research & development of algorithms for PET and SPECT image reconstruction. Work is closely related to on-going research on (a) multi-channel regularization for PET/CT and SPECT/CT imaging, (b) joint reconstruction and image matching for spatio-temporal pulmonary PET/CT and cardiac SPECT/CT imaging, and (c) task-based reconstruction by iterative deep neural networks. An important part is to integrate routines for forward and backprojection from reconstruction packages like STIR and EMrecon for PET and NiftyRec for SPECT with ODL (http://github.com/odlgroup/odl), our Python based framework for reconstruction. Part of the research may include industrial (Elekta and Philips Healthcare) and clinical (Karolinska University Hospital) collaboration.
Announcement & instructions:
http://www.kth.se/en/om/work-at-kth/lediga-jobb/what:job/jobID:158920/type:job/where:4/apply:1

Postdoctoral fellow in Image Reconstruction/Deep Dictionary Learning (S-2017-1165)
Deadline: December 1, 2017
Brief description:

The position includes research & development of theory and algorithms that combine methods from machine learning with sparse signal processing for joint dictionary design and image reconstruction in tomography. A key element is to design dictionaries that not only yield sparse representation, but also contain discriminative information. Methods will be implemented in ODL (http://github.com/odlgroup/odl), our Python based framework for reconstruction which enables one to utilize the existing integration between ODL and TensorFlow. The research is part of a larger effort that aims to combine elements of variational regularization with machine learning for solving large scale inverse problems, see the arXiv-reports http://arxiv.org/abs/1707.06474 and http://arxiv.org/abs/1704.04058 and the blog-post at http://adler-j.github.io/2017/07/21/Learning-to-reconstruct.html for further details. Part of the research may include industrial (Elekta and Philips Healthcare) and clinical (Karolinska University Hospital) collaboration.
Announcement & instructions:
http://www.kth.se/en/om/work-at-kth/lediga-jobb/what:job/jobID:158923/type:job/where:4/apply:1




Best regards,
Ozan


--

Assoc. Prof. Ozan Öktem
Director, KTH Life Science Technology Platform
Web: http://ww.kth.se/lifescience


Department of Mathematics
KTH Royal Institute of Technology
SE-100 44 Stockholm, Sweden
E-mail: ozan@kth.se




Learned Primal-dual Reconstruction by Jonas Adler, Ozan Öktem

We propose a Learned Primal-Dual algorithm for tomographic reconstruction. The algorithm includes the (possibly non-linear) forward operator in a deep neural network inspired by unrolled proximal primal-dual optimization methods, but where the proximal operators have been replaced with convolutional neural networks. The algorithm is trained end-to-end, working directly from raw measured data and does not depend on any initial reconstruction such as FBP.
We evaluate the algorithm on low dose CT reconstruction using both analytic and human phantoms against classical reconstruction given by FBP and TV regularized reconstruction as well as deep learning based post-processing of a FBP reconstruction.
For the analytic data we demonstrate PSNR improvements of >10 dB when compared to both TV reconstruction and learned post-processing. For the human phantom we demonstrate a 6.6 dB improvement compared to TV and a 2.2 dB improvement as compared to learned post-processing. The proposed algorithm also improves upon the compared algorithms with respect to the SSIM and the evaluation time is approximately 600 ms for a 512 x 512 pixel dataset.  

Solving ill-posed inverse problems using iterative deep neural networks by Jonas Adler, Ozan Öktem
We propose a partially learned approach for the solution of ill posed inverse problems with not necessarily linear forward operators. The method builds on ideas from classical regularization theory and recent advances in deep learning to perform learning while making use of prior information about the inverse problem encoded in the forward operator, noise model and a regularizing functional. The method results in a gradient-like iterative scheme, where the "gradient" component is learned using a convolutional network that includes the gradients of the data discrepancy and regularizer as input in each iteration. We present results of such a partially learned gradient scheme on a non-linear tomographic inversion problem with simulated data from both the Shepp-Logan phantom as well as a head CT. The outcome is compared against FBP and TV reconstruction and the proposed method provides a 5.4 dB PSNR improvement over the TV reconstruction while being significantly faster, giving reconstructions of 512 x 512 volumes in about 0.4 seconds using a single GPU.
An implementation is here: https://github.com/adler-j/learned_gradient_tomography
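
For readers who want the shape of the idea in code, here is a simplified PyTorch sketch of an unrolled, partially learned gradient scheme in the spirit of the paper (my simplification under assumed operator interfaces; the authors' actual implementation is in the repository above and builds on ODL):

    import torch
    import torch.nn as nn

    class LearnedGradientStep(nn.Module):
        """One unrolled iteration: x <- x + f_theta(x, grad_data, grad_reg)."""
        def __init__(self, channels=32):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
                nn.Conv2d(channels, 1, 3, padding=1))

        def forward(self, x, grad_data, grad_reg):
            return x + self.net(torch.cat([x, grad_data, grad_reg], dim=1))

    class LearnedGradientNet(nn.Module):
        """fwd_op, adj_op and reg_grad are assumed callables for A, A^T and
        the gradient of the regularizer, acting on image batches."""
        def __init__(self, fwd_op, adj_op, reg_grad, n_iter=10):
            super().__init__()
            self.fwd_op, self.adj_op, self.reg_grad = fwd_op, adj_op, reg_grad
            self.steps = nn.ModuleList(
                [LearnedGradientStep() for _ in range(n_iter)])

        def forward(self, y, x0):
            x = x0
            for step in self.steps:
                grad_data = self.adj_op(self.fwd_op(x) - y)  # grad of 0.5||Ax-y||^2
                grad_reg = self.reg_grad(x)                  # grad of regularizer
                x = step(x, grad_data, grad_reg)
            return x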
 
 

          SPORCO: A Python package for standard and convolutional sparse representations - implementation -        



Brendt just sent me the following:

Hi Igor,
I noticed that you maintain an extensive list of software tools for sparse representations and related problems. Could you please add a reference to SPORCO, which is a relatively new library providing algorithms for sparse coding and dictionary learning? It supports standard sparse representations as well as a variety of other problems, including ℓ1-TV and ℓ2-TV regularization and Robust PCA, but the major strength is in algorithms for convolutional sparse coding and dictionary learning (the form of sparse coding inspired by deconvolutional networks). 
A Matlab version is available at 

but development is now focused on the Python version, available on GitHub at

The Python version features an object-oriented design that allows the existing ADMM algorithms to be extended or modified with limited effort, as described in some detail in a paper presented at the recent SciPy conference

Thanks,
Brendt

Thanks Brendt ! Let me add this to the Advanced Matrix Factorization Jungle page in the coming days. In the meantime, here is the paper:  


SPORCO: A Python package for standard and convolutional sparse representations by Brendt Wohlberg

SParse Optimization Research COde (SPORCO) is an open-source Python package for solving optimization problems with sparsity-inducing regularization, consisting primarily of sparse coding and dictionary learning, for both standard and convolutional forms of sparse representation. In the current version, all optimization problems are solved within the Alternating Direction Method of Multipliers (ADMM) framework. SPORCO was developed for applications in signal and image processing, but is also expected to be useful for problems in computer vision, statistics, and machine learning.
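
To make the ADMM framing concrete, here is a small self-contained numpy sketch (an illustration, not SPORCO's code) of the canonical problem in this family, BPDN, min_x 0.5||Dx - s||^2 + lmbda*||x||_1, solved with a plain ADMM loop:

    import numpy as np

    def soft_threshold(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def bpdn_admm(D, s, lmbda, rho=1.0, n_iter=200):
        n = D.shape[1]
        z = np.zeros(n); u = np.zeros(n)
        Dts = D.T @ s
        L = np.linalg.cholesky(D.T @ D + rho * np.eye(n))  # factor once, reuse
        for _ in range(n_iter):
            x = np.linalg.solve(L.T, np.linalg.solve(L, Dts + rho * (z - u)))
            z = soft_threshold(x + u, lmbda / rho)  # prox of the l1 term
            u = u + x - z                           # scaled dual update
        return z

    # recover a sparse code for a random dictionary
    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 128))
    x_true = np.zeros(128); x_true[rng.choice(128, 5, replace=False)] = 1.0
    x_hat = bpdn_admm(D, D @ x_true + 0.01 * rng.standard_normal(64), lmbda=0.1)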



          Papers: ICML2017 workshop on Implicit Models        
The organizers (David Blei, Ian Goodfellow, Balaji Lakshminarayanan, Shakir Mohamed, Rajesh Ranganath, Dustin Tran) made the papers of the ICML2017 workshop on Implicit Models available here:

  1. A-NICE-MC: Adversarial Training for MCMC Jiaming Song, Shengjia Zhao, Stefano Ermon
  2. ABC-GAN: Adaptive Blur and Control for improved training stability of Generative Adversarial Networks Igor Susmelj, Eirikur Agustsson, Radu Timofte
  3. Adversarial Inversion for Amortized Inference Zenna Tavares, Armando Solar Lezama
  4. Adversarial Variational Inference for Tweedie Compound Poisson Models Yaodong Yang, Sergey Demyanov, Yuanyuan Liu, Jun Wang
  5. Adversarially Learned Boundaries in Instance Segmentation Amy Zhang
  6. Approximate Inference with Amortised MCMC Yingzhen Li, Richard E. Turner, Qiang Liu
  7. Can GAN Learn Topological Features of a Graph? Weiyi Liu, Pin-Yu Chen, Hal Cooper, Min Hwan Oh, Sailung Yeung, Toyotaro Suzumura
  8. Conditional generation of multi-modal data using constrained embedding space mapping Subhajit Chaudhury, Sakyasingha Dasgupta, Asim Munawar, Md. A. Salam Khan and Ryuki Tachibana
  9. Deep Hybrid Discriminative-Generative Models for Semi-Supervised Learning Volodymyr Kuleshov, Stefano Ermon
  10. ELFI, a software package for likelihood-free inference Jarno Lintusaari, Henri Vuollekoski, Antti Kangasrääsiö, Kusti Skyten, Marko Järvenpää, Michael Gutmann, Aki Vehtari, Jukka Corander, Samuel Kaski
  11. Flow-GAN: Bridging implicit and prescribed learning in generative models Aditya Grover, Manik Dhar, Stefano Ermon
  12. GANs Powered by Autoencoding — A Theoretic Reasoning Zhifei Zhang, Yang Song, and Hairong Qi
  13. Geometric GAN Jae Hyun Lim and Jong Chul Ye
  14. Gradient Estimators for Implicit Models Yingzhen Li, Richard E. Turner
  15. Implicit Manifold Learning on Generative Adversarial Networks Kry Yik Chau Lui, Yanshuai Cao, Maxime Gazeau, Kelvin Shuangjian Zhang
  16. Implicit Variational Inference with Kernel Density Ratio Fitting Jiaxin Shi, Shengyang Sun, Jun Zhu
  17. Improved Network Robustness with Adversarial Critic Alexander Matyasko, Lap-Pui Chau
  18. Improved Training of Wasserstein GANs Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron Courville
  19. Inference in differentiable generative models Matthew M. Graham and Amos J. Storkey
  20. Joint Training in Generative Adversarial Networks R Devon Hjelm, Athul Paul Jacob, Yoshua Bengio
  21. Latent Space GANs for 3D Point Clouds Panos Achlioptas, Olga Diamanti, Ioannis Mitliagkas, Leonidas Guibas
  22. Likelihood Estimation for Generative Adversarial Networks Hamid Eghbal-zadeh, Gerhard Widmer
  23. Maximizing Independence with GANs for Non-linear ICA Philemon Brakel, Yoshua Bengio 
  24. Non linear Mixed Effects Models: Bridging the gap between Independent Metropolis Hastings and Variational Inference Belhal Karimi
  25. Practical Adversarial Training with Empirical Distribution Ambrish Rawat, Mathieu Sinn, Maria-Irina Nicolae
  26. Recursive Cross-Domain Facial Composite and Generation from Limited Facial Parts Yang Song, Zhifei Zhang, Hairong Qi
  27. Resampled Proposal Distributions for Variational Inference and Learning Aditya Grover, Ramki Gummadi, Miguel Lazaro-Gredil, Dale Schuurmans, Stefano Ermon
  28. Rigorous Analysis of Adversarial Training with Empirical Distributions Mathieu Sinn, Ambrish Rawat, Maria-Irina Nicolae
  29. Robust Controllable Embedding of High-Dimensional Observations of Markov Decision Processes Ershad Banijamali, Rui Shu, Mohammad Ghavamzadeh, Hung Bui
  30. Spectral Normalization for Generative Adversarial Network Takeru Miyato, Toshiki Kataoka, Masanori Koyama, Yuichi Yoshida
  31. Stabilizing the Conditional Adversarial Network by Decoupled Learning Zhifei Zhang, Yang Song, and Hairong Qi
  32. Stabilizing Training of Generative Adversarial Networks through Regularization Kevin Roth, Aurelien Lucchi, Sebastian Nowozin & Thomas Hofmann
  33. Stochastic Reconstruction of Three-Dimensional Porous Media using Generative Adversarial Networks Lukas Mosser, Olivier Dubrule, Martin J. Blunt
  34. The Amortized Bootstrap Eric Nalisnick, Padhraic Smyth 
  35. The Numerics of GANs Lars Mescheder, Sebastian Nowozin, Andreas Geiger
  36. Towards the Use of Gaussian Graphical Models in Variational Autoencoders Alexandra Pește, Luigi Malagò
  37. Training GANs with Variational Statistical Information Minimization Michael Ghaben
  38. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks Jun-Yan Zhu*, Taesung Park*, Phillip Isola, Alexei A. Efros
  39. Unsupervised Domain Adaptation Using Approximate Label Matching Jordan T. Ash, Robert E. Schapire, Barbara E. Englhardt
  40. Variance Regularizing Adversarial Learning Karan Grewal, R Devon Hjelm, Yoshua Bengio
  41. Variational Representation Autoencoders to Reduce Mode Collapse in GANs Akash Srivastava, Lazar Valkov, Chris Russell, Michael U. Gutmann, Charles Sutton


Dougal Sutherland, Evaluating and Training Implicit Generative Models with Two-Sample Tests
Samples from implicit generative models are difficult to judge quantitatively: particularly for images, it is typically easy for humans to identify certain kinds of samples which are very unlikely under the reference distribution, but very difficult for humans to identify when modes are missing, or when types are merely under- or over-represented. This talk will overview different approaches towards evaluating the output of an implicit generative model, with a focus on identifying ways in which the model has failed. Some of these approaches also form the basis for the objective functions of GAN variants which can help avoid some of the issues of stability and mode-dropping in the original GAN.
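
For readers who want to try this at home, here is a minimal sketch of one of the two-sample statistics this line of work builds on: the unbiased estimator of squared MMD (maximum mean discrepancy) with a Gaussian kernel. This is an illustration of the general idea, not code from the talk; the function name and the fixed bandwidth are my own choices.

    import numpy as np

    def mmd2_unbiased(X, Y, bandwidth=1.0):
        # Unbiased estimate of MMD^2 between samples X (m x d) and Y (n x d).
        def gram(A, B):
            # Gaussian RBF kernel matrix: k(a, b) = exp(-||a - b||^2 / (2 bw^2)).
            sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
            return np.exp(-sq / (2 * bandwidth**2))
        m, n = len(X), len(Y)
        Kxx, Kyy, Kxy = gram(X, X), gram(Y, Y), gram(X, Y)
        # U-statistic: drop the diagonal (self-similarity) terms.
        return ((Kxx.sum() - np.trace(Kxx)) / (m * (m - 1))
                + (Kyy.sum() - np.trace(Kyy)) / (n * (n - 1))
                - 2 * Kxy.mean())

A value that is large relative to its null distribution (estimated, for instance, by permuting the pooled samples) indicates that the generated samples are distinguishable from the reference data.
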
Kerrie Mengersen, Probabilistic Modelling in the Real World
Interest is intensifying in the development and application of Bayesian approaches to estimation of real-world processes using probabilistic models. This presentation will focus on three substantive case studies in which we have been involved: protecting the Great Barrier Reef in Australia from impacts such as crown of thorns starfish and industrial dredging, reducing congestion at international airports, and predicting survival of jaguars in the Peruvian Amazon. Through these examples, we will explore current ideas about Approximate Bayesian Computation, Populations of Models, Bayesian priors and p-values, and Bayesian dynamic networks.

Sanjeev Arora, Do GANs actually learn the distribution? Some theory and empirics
The Generative Adversarial Nets or GANs framework (Goodfellow et al'14) for learning distributions differs from older ideas such as autoencoders and deep Boltzmann machines in that it scores the generated distribution using a discriminator net, instead of a perplexity-like calculation. It appears to work well in practice, e.g., the generated images look better than older techniques. But how well do these nets learn the target distribution?
Our Paper 1 (ICML'17) shows that GAN training may not have good generalization properties; e.g., training may appear successful, but the trained distribution may be far from the target distribution in standard metrics. We show theoretically that this can happen even though the 2-person game between discriminator and generator is in near-equilibrium, where the generator appears to have "won" (with respect to natural training objectives).
Paper 2 (arXiv, June 26) empirically tests whether this lack of generalization occurs in real-life training. The paper introduces a new quantitative test for the diversity of a distribution based upon the famous birthday paradox. This test reveals that distributions learnt by some leading GAN techniques have fairly small support (i.e., suffer from mode collapse), which implies that they are far from the target distribution.
Paper 1: "Generalization and Equilibrium in Generative Adversarial Nets (GANs)" by Arora, Ge, Liang, Ma, Zhang. (ICML 2017)
Paper 2: "Do GANs actually learn the distribution? An empirical study." by Arora and Zhang (https://arxiv.org/abs/1706.08224)
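
The arithmetic behind the birthday-paradox test is the standard one (my notation, not the papers'): for a distribution that is roughly uniform over a support of size N, the probability of seeing at least one duplicate among s independent samples is

\[ P(\text{collision}) \approx 1 - \exp\!\left(-\frac{s(s-1)}{2N}\right), \]

so duplicates that show up reliably in batches of size s point to a support on the order of s^2.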

Stefano Ermon, Generative Adversarial Imitation Learning
Consider learning a policy from example expert behavior, without interaction with the expert or access to a reward or cost signal. One approach is to recover the expert’s cost function with inverse reinforcement learning, then compute an optimal policy for that cost function. This approach is indirect and can be slow. In this talk, I will discuss a new generative modeling framework for directly extracting a policy from data, drawing an analogy between imitation learning and generative adversarial networks. I will derive a model-free imitation learning algorithm that obtains significant performance gains over existing methods in imitating complex behaviors in large, high-dimensional environments. Our approach can also be used to infer the latent structure of human demonstrations in an unsupervised way. As an example, I will show a driving application where a model learned from demonstrations is able to both produce different driving styles and accurately anticipate human actions using raw visual inputs.
Qiang Liu, Wild Variational Inference with Expressive Variational Families
Variational inference (VI) provides a powerful tool for reasoning with highly complex probabilistic models in machine learning. The basic idea of VI is to approximate complex target distributions with simpler distributions found by minimizing the KL divergence within some predefined parametric families. A key limitation of the typical VI techniques, however, is that they require the variational family to be simple enough to have tractable likelihood functions, which excludes a broad range of flexible, expressive families such as those defined via implicit models. In this talk, we will discuss a general framework for (wild) variational inference that works for much more expressive, implicitly defined variational families with intractable likelihood functions. Our key idea is to first lift the optimization problem into the infinite-dimensional space, where it is solved using nonparametric particle methods, and then project the update back to the finite-dimensional parameter space that we want to optimize with. Our framework is highly general and allows us to leverage any existing particle methods as the inference engine for wild variational inference, including MCMC and Stein variational gradient methods.
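
For reference, the Stein variational gradient update mentioned at the end of the talk transports a set of particles {x_i} toward the target p as follows (as in Liu and Wang, 2016, with k a positive-definite kernel and step size \epsilon):

\[ x_i \leftarrow x_i + \epsilon\, \hat{\phi}(x_i), \qquad \hat{\phi}(x) = \frac{1}{n}\sum_{j=1}^{n}\left[ k(x_j, x)\, \nabla_{x_j}\log p(x_j) + \nabla_{x_j} k(x_j, x) \right], \]

where the first term pulls particles toward high-density regions and the second acts as a repulsive force that keeps them spread out.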



Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

          CfP: 17th Conference on Artificial and Computational Intelligence and its Applications to the Environmental Sciences, American Meteorological Society        
Philippe just let me know of the following fascinating opportunity (the deadline is August 8th, but it can be extended; in that case, you need to get in touch with him directly).

Hi: If you are working on Artificial Intelligence/Machine Learning Applications to Environmental Sciences, we have a terrific conference coming up in Austin, Texas, January 7-11, 2018. 
We are organizing the 17th Conference on Artificial and Computational Intelligence and its Applications to the Environmental Sciences as part of the 2018 annual meeting of the American Meteorological Society. 

We have sessions in areas such as weather prediction, extreme weather, energy, climate studies, the coastal environment, health warnings, and high performance computing, as well as general artificial intelligence application sessions. 
Two of our sessions, Machine Learning and Statistics in Data Science and Climate Studies, will be headlined by invited talks. 
Several of the sessions are co-organized with other conferences providing opportunities to network with researchers and professionals in other fields. 
We also have a few firsts, including sessions focused on Machine Learning and Climate Studies, on AI Applications to the Environment in Private Companies and Public-Private Partnerships, and on early health warnings. 
To submit your abstract: AI Abstracts Submission 
For more information on our sessions: AI Sessions. More information on the AMS Annual Meeting: Overall AMS Annual Meeting Website.
See you in Austin. 
The AMS AI Committee
More information on the AMS AI committee: AMS AI Committee Web Page



Here are the AI Sessions:

  • AI Applications to the Environment in Private Companies and Public-private Partnerships. Topic Description: With the rapid development of AI techniques in meteorological and environmental disciplines, a significant amount of research is occurring in the private sector and in collaborations between companies and academia. This session will focus on AI applications in private companies and public-private partnerships, showcasing new approaches and implementations that leverage AI to help solve complex problems.
  • AI Techniques Applied to Environmental Science
  • AI Techniques for Decision Support
  • AI Techniques for Extreme Weather and Risk Assessment
  • AI Techniques for Numerical Weather Predictions
  • AI and Climate Informatics
  • Joint Session: Applications of Artificial Intelligence in the Coastal Environment (Joint between the 17th Conf on Artificial and Computational Intelligence and its Applications to the Environmental Sciences and the 16th Symposium on the Coastal Environment). Topic Description: Contributions to this session are sought in the application of AI techniques to study coastal problems including coastal hydrodynamics, beach and marsh morphology, applications of remote sensing observations and other large data sets.
  • Joint Session: Artificial Intelligence and High Performance Computing (Joint between the 17th Conf on Artificial and Computational Intelligence and its Applications to the Environmental Sciences and the Fourth Symposium on High Performance Computing for Weather, Water, and Climate)
  • Joint Session: Machine Learning and Climate Studies (Joint between the 17th Conf on Artificial and Computational Intelligence and its Applications to the Environmental Sciences and the 31st Conference on Climate Variability and Change)
  • Joint Session: Machine Learning and Statistics in Data Science (Joint between the 17th Conf on Artificial and Computational Intelligence and its Applications to the Environmental Sciences and the 25th Conference on Probability and Statistics)
  • Statistical Learning in the Environmental Sciences



          Nuit Blanche in Review (July 2017)        
Since the last Nuit Blanche in Review (June 2017), it was found that Titan had interesting chemistry. On Nuit Blanche, on the other hand, we had four implementations released by their authors and several interesting in-depth articles (some of them related to SGD and hardware). We had several slides and videos from meetings and schools, and three job offerings. Enjoy!


In-depth

SGD related

CS/ML Hardware


Slides

Videos

Job:

Other 


Credit: Northern Summer on Titan, NASA/JPL-Caltech/Space Science Institute



          Randomized apertures: high resolution imaging in far field        
Using glitter as a way to replace large mirror structures for space telescopes: this is what is suggested and measured here. The random PSF allows for sharper resolution (and Machine Learning is used). This is another instance of the Great Convergence, woohoo! (And by the way, are we ever going to acknowledge that the Random Lens Imaging paper is one of the greatest preprints that never made it into publication?)



Randomized apertures: high resolution imaging in far field by Xiaopeng Peng, Garreth J. Ruane, Marco B. Quadrelli, and Grover A. Swartzlander
We explore opportunities afforded by an extremely large telescope design comprised of ill-figured randomly varying subapertures. The veracity of this approach is demonstrated with a laboratory scaled system whereby we reconstruct a white light binary point source separated by 2.5 times the diffraction limit. With an inherently unknown varying random point spread function, the measured speckle images require a restoration framework that combine support vector machine based lucky imaging and non-negative matrix factorization based multiframe blind deconvolution. To further validate the approach, we model the experimental system to explore sub-diffraction-limited performance, and an object comprised of multiple point sources.
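
For context, the usual multiframe blind deconvolution model behind such a restoration pipeline (my notation, not the paper's) treats each recorded speckle frame as the unknown object blurred by an unknown, frame-varying PSF:

\[ y_k = h_k \ast x + n_k, \qquad k = 1, \dots, K, \]

where x is the object, h_k the random PSF of frame k, and n_k noise; the object and the PSFs are estimated jointly, here with nonnegativity enforced through an NMF-style factorization of the selected (lucky) frames.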



          Why is #Netneutrality Important to Data Science, Machine Learning and Artificial Intelligence ?        

So your ISP decides the speed or the kind of service you can get, based on religion or whatnot. What happens to our field? Because you have spent too much time on the following services, you are throttled down or have to pay for "premium" services. As a result, you may or may not get to:

  • follow Andrew Ng's Coursera or Siraj Raval classes
  • submit your Kaggle results on time
  • read ArXiv preprints
  • read the latest GAN paper on time
  • watch NIPS/ICLR/CVPR/ACL videos
  • download datasets
  • pay more to use ML/DL on the cloud
  • share reviews
  • download the latest ML/DL frameworks
  • have access to your Slack channels 
  • read Nuit Blanche
  • follow awesome DL thread on Twitter
  • get scholar google alerts
  • .... 
The rest is on Twitter



          SGD, What Is It Good For?        
Image Credit: NASA/JPL-Caltech/Space Science Institute, 
N00284488.jpg, Titan, Jul. 11, 2017 10:12 AM

As noted by the activity on the subject, there is growing interest in better understanding SGD and related methods; we mentioned two such studies recently on Nuit Blanche.

Sebastian Ruder updated his blog entry on the subject, An overview of gradient descent optimization algorithms (adding derivations of AdaMax and Nadam). In Reinforcement Learning or Evolutionary Strategies? Nature has a solution: Both, Arthur Juliani mentions an insight on gradient-based methods in RL (h/t Tarin for the pointer on Twitter):

It is clear that for many reactive policies, or situations with extremely sparse rewards, ES is a strong candidate, especially if you have access to the computational resources that allow for massively parallel training. On the other hand, gradient-based methods using RL or supervision are going to be useful when a rich feedback signal is available, and we need to learn quickly with less data.

But we also had people trying to speed SGD up, while others took the whole adaptive approach with a grain of salt. We also have one paper where SGD, helped by random features, solves the linear Bellman equation, a tool central to linear control theory. 
Deep learning thrives with large neural networks and large datasets. However, larger networks and larger datasets result in longer training times that impede research and development progress. Distributed synchronous SGD offers a potential solution to this problem by dividing SGD minibatches over a pool of parallel workers. Yet to make this scheme efficient, the per-worker workload must be large, which implies nontrivial growth in the SGD minibatch size. In this paper, we empirically show that on the ImageNet dataset large minibatches cause optimization difficulties, but when these are addressed the trained networks exhibit good generalization. Specifically, we show no loss of accuracy when training with large minibatch sizes up to 8192 images. To achieve this result, we adopt a linear scaling rule for adjusting learning rates as a function of minibatch size and develop a new warmup scheme that overcomes optimization challenges early in training. With these simple techniques, our Caffe2-based system trains ResNet-50 with a minibatch size of 8192 on 256 GPUs in one hour, while matching small minibatch accuracy. Using commodity hardware, our implementation achieves ~90% scaling efficiency when moving from 8 to 256 GPUs. This system enables us to train visual recognition models on internet-scale data with high efficiency. 
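
The two tricks named in this abstract, the linear scaling rule and gradual warmup, boil down to a simple learning-rate schedule. A hypothetical sketch in Python (the names and constants are mine; the paper warms up over the first few epochs):

    def learning_rate(step, base_lr=0.1, batch=8192, ref_batch=256, warmup_steps=2500):
        # Linear scaling rule: scale the reference learning rate by the minibatch ratio.
        target = base_lr * batch / ref_batch
        if step < warmup_steps:
            # Gradual warmup: ramp linearly from base_lr up to the scaled target.
            return base_lr + (target - base_lr) * step / warmup_steps
        return target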


Adaptive optimization methods, which perform local optimization with a metric constructed from the history of iterates, are becoming increasingly popular for training deep neural networks. Examples include AdaGrad, RMSProp, and Adam. We show that for simple overparameterized problems, adaptive methods often find drastically different solutions than gradient descent (GD) or stochastic gradient descent (SGD). We construct an illustrative binary classification problem where the data is linearly separable, GD and SGD achieve zero test error, and AdaGrad, Adam, and RMSProp attain test errors arbitrarily close to half. We additionally study the empirical generalization capability of adaptive methods on several state-of-the-art deep learning models. We observe that the solutions found by adaptive methods generalize worse (often significantly worse) than SGD, even when these solutions have better training performance. These results suggest that practitioners should reconsider the use of adaptive methods to train neural networks.
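
To make "a metric constructed from the history of iterates" concrete, compare the plain SGD step with AdaGrad's coordinate-wise rescaling (standard updates, with g_t the gradient at step t and the division taken element-wise):

\[ \theta_{t+1} = \theta_t - \eta\, g_t \qquad \text{vs.} \qquad \theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\sum_{s\le t} g_s^2} + \epsilon}\, g_t. \]

RMSProp and Adam replace the running sum with exponential moving averages; it is exactly this history-dependent rescaling that the paper argues can hurt generalization.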

We introduce a data-efficient approach for solving the linear Bellman equation, which corresponds to a class of Markov decision processes (MDPs) and stochastic optimal control (SOC) problems. We show that this class of control problem can be cast as a stochastic composition optimization problem, which can be further reformulated as a saddle point problem and solved via dual kernel embeddings [1]. Our method is model-free and using only one sample per state transition from stochastic dynamical systems. Different from related work such as Z-learning [2, 3] based on temporal-difference learning [4], our method is an online algorithm following the true stochastic gradient. Numerical results are provided, showing that our method outperforms the Z-learning algorithm
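
The "linear Bellman equation" here is the one from linearly solvable MDPs (Todorov): writing z(s) = exp(-v(s)) for the desirability of state s, c(s) for the state cost, and p for the passive dynamics, the Bellman equation becomes linear in z (my notation, not the paper's):

\[ z(s) = e^{-c(s)}\, \mathbb{E}_{s' \sim p(\cdot\mid s)}\big[ z(s') \big]. \]

It is this linearity that the saddle-point reformulation and the dual kernel embeddings exploit.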


Gradient descent optimization algorithms, while increasingly popular, are often used as black-box optimizers, as practical explanations of their strengths and weaknesses are hard to come by. This article aims to provide the reader with intuitions with regard to the behaviour of different algorithms that will allow her to put them to use. In the course of this overview, we look at different variants of gradient descent, summarize challenges, introduce the most common optimization algorithms, review architectures in a parallel and distributed setting, and investigate additional strategies for optimizing gradient descent.
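
As a taste of the variants the overview covers, here is classical momentum in a few lines of Python (a minimal sketch; the parameter names are mine):

    def sgd_momentum(params, grads, velocity, lr=0.01, mu=0.9):
        # Accumulate an exponentially-weighted velocity, then step along it.
        for i in range(len(params)):
            velocity[i] = mu * velocity[i] - lr * grads[i]
            params[i] += velocity[i]
        return params, velocity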



          PASS Summit 2016 – Blogging again – Keynote 1        

So I’m back at the PASS Summit, and the keynote’s on! We’re all getting ready for a bunch of announcements about what’s coming in the world of the Microsoft Data Platform.

First up – Adam Jorgensen. Some useful stats about PASS, and this year’s PASSion Award winner, Mala Mahadevan (@sqlmal)

There are tweets going on using #sqlpass and #sqlsummit – you can get a lot of information from there.

Joseph Sirosh – Corporate Vice President for the Data Group, Microsoft – is on stage now. He’s talking about the 400M children in India (that’s more than all the people in the United States, Mexico, and Canada combined), and the opportunities around student drop-out. Andhra Pradesh is predicting student drop-out using new ACID – Algorithms, Cloud, IoT, Data. I say “new” because ACID is an acronym database professionals know well.

He’s moving on to talk about three patterns: Intelligence DB, Intelligent Lake, Deep Intelligence.

Intelligence DB – taking the intelligence out of the application and moving it into the database. Instead of the application controlling the ‘smarts’, putting them into the database provides models, security, and a number of other useful benefits, letting any application sit on top of it. It can use SQL Server, particularly with SQL Server R Services, and support applications whether in the cloud, on-prem, or hybrid.

Rohan Kumar – General Manager of Database Systems – is up now. Fully Managed HTAP in Azure SQL DB hits General Availability on Nov 15th. HTAP is Hybrid Transactional / Analytical Processing, which fits really nicely with my session on Friday afternoon. He’s doing a demo showing the predictions per second (using SQL Server R Services), and how it easily reaches 1,000,000 per second. You can see more of this at this post, which is really neat.

Justin Silver, a Data Scientist from PROS, comes on stage to show how a customer of theirs handles 100 million price requests every day, responding to each one in under 200 milliseconds. Again we hear about SQL Server R Services, which pushes home the impact of this feature in SQL 2016. Justin explains that using R inside SQL Server 2016, they can achieve 100x better performance. It’s very cool stuff.

Rohan’s back, showing a Polybase demo against MongoDB. I’m sitting next to Kendra Little (@kendra_little), who is pretty sure it’s the first MongoDB demo at PASS. Rohan then moves on to show SQL on Linux: he not only installed SQL on Linux, but then restored a database from a backup that was taken on a Windows box, connected to it from SSMS, and ran queries. Good stuff.

Back to Joseph, who introduces Kalle Hiitola from Next Games – a Finnish gaming company – who created an iOS game that runs on Azure Media Services and DocumentDB, using BizSpark. 15 million installs, with 120GB of new data every day. 11,500 DocumentDB requests per second, and 43 million “Walkers” (zombies in their ‘Walking Dead’ game) eliminated every day. 1.9 million matches (I don’t think it’s about zombie dating though) per day. Nice numbers.

Now onto Intelligent Lake. Larger volumes of data than ever before take a different kind of strategy.

Scott Smith – VP of Product Development from Integral Analytics – comes in to show how Azure SQL Data Warehouse has allowed them to scale like never before in the electric-energy industry. He’s got some great visuals.

Julie Koesmarno on stage now. Can’t help but love Julie – she’s come a long way in the short time since leaving LobsterPot Solutions. She’s done Sentiment Analysis on War & Peace. It’s good stuff, and Julie’s demo is very popular.

Deep Intelligence is using Neural Networks to recognise components in images. eSmart Systems have a drone-based system for looking for faults in power lines. It’s got a familiar feel to it, based on discussions we’ve been having with some customers (but not with power lines).

Using R Services with ML algorithms, there are some great options available…

Jen Stirrup on now. She’s talking about Pokemon Go and Azure ML. I don’t understand the Pokemon stuff, but the Machine Learning stuff makes a lot of sense. Why not use ML to find out where to find Pokemon?

There’s an amazing video about using Cognitive Services to help a blind man interpret his surroundings. For me, this is the best demo of the morning, because it’s where this stuff can be really useful.

SQL is changing the world.

@rob_farley


          You’ve been doing cloud for years...        

This month’s T-SQL Tuesday is hosted by Jeffrey Verheul (@devjef) and is on the topic of Cloud.

I seem to spend quite a bit of my time these days helping people realise the benefit of the Azure platform, whether it be Machine Learning for doing some predictions around various things (best course of action, or expected value, for example), or keeping a replicated copy of data somewhere outside the organisation’s network, or even a full-blown Internet of Things piece with Stream Analytics pulling messages off a Service Bus Event Hub. But primarily, the thing that I have to combat most of all is this:

Do I really want that stuff to be ‘out there’?

People are used to having their data, their company information, their processing, going on somewhere outside the building where they physically are.

Now, there are plenty of times when organisations’ server rooms aren’t actually providing as much benefit as they expect. Conversations with people quickly help point out that their web site isn’t hosted locally (I remember in the late ‘90s a company I was at making the decision to start hosting their web site at an actual hosting provider rather than having every web request come in through the same modem as all their personal web browsing). Email servers are often the next to go. But for anyone working at home, the server room may as well be ‘the cloud’ anyway, because their data is going off to some ‘unknown’ place, with a decent amount of cabling between where they are and where their data is hosted.

Everyone’s photos are stored in the ‘cloud’ already, whether it be in Instagram’s repository or in something which is more obviously ‘the cloud’. Messages with people no longer just live on people’s phones, but on the servers of Facebook and Twitter. Their worries and concerns are no longer just between them and their psychiatrist, but stored on Google’s search engine web logs.

The ‘cloud’ is part of today’s world. You’re further into it than you may appreciate. So don’t be afraid, but try it out. Play with Azure ML, or with other areas of Cortana Intelligence. Put some things together to help yourself in your day-to-day activity. You could be pleasantly surprised about what you can do.

@rob_farley


          Big pharma turns to artificial intelligence to speed drug discovery, GSK signs deal        

By Ben Hirschler

LONDON (Reuters) - The world's leading drug companies are turning to artificial intelligence to improve the hit-and-miss business of finding new medicines, with GlaxoSmithKline unveiling a new $43 million deal in the field on Sunday.

Other pharmaceutical giants including Merck & Co, Johnson & Johnson and Sanofi are also exploring the potential of artificial intelligence (AI) to help streamline the drug discovery process.

The aim is to harness modern supercomputers and machine learning systems to predict how molecules will behave and how likely they are to make a useful drug, thereby saving time and money on unnecessary tests.

AI systems already play a central role in other high-tech areas such as the development of driverless cars and facial recognition software.

"Many large pharma companies are starting to realise the potential of this approach and how it can help improve efficiencies," said Andrew Hopkins, chief executive of privately owned Exscientia, which announced the new tie-up with GSK.

Hopkins, who used to work at Pfizer, said Exscientia's AI system could deliver drug candidates in roughly one-quarter of the time and at one-quarter of the cost of traditional approaches.

The Scotland-based company, which also signed a deal with Sanofi in May, is one of a growing number of start-ups on both sides of the Atlantic that are applying AI to drug research. Others include U.S. firms Berg, Numerate, twoXAR and Atomwise, as well as Britain's BenevolentAI.

"In pharma's eyes these companies are essentially digital biotechs that they can strike partnerships with and which help feed the pipeline," said Nooman Haque, head of life sciences at Silicon Valley Bank in London.

"If this technology really proves itself, you may start to see M&A with pharma, and closer integration of these AI engines into pharma R&D."

STILL TO BE PROVEN

It is not the first time drugmakers have turned to high-tech solutions to boost R&D productivity.

The introduction of "high throughput screening", using robots to rapidly test millions of compounds, generated mountains of leads in the early 2000s but notably failed to solve inefficiencies in the research process.

When it comes to AI, big pharma is treading cautiously, in the knowledge that the technology has yet to demonstrate it can successfully bring a new molecule from computer screen to lab to clinic and finally to market.

"It's still to be proven, but we definitely think we should do the experiment," said John Baldoni, GSK's head of platform technology and science.

Baldoni is also ramping up in-house AI investment at the drugmaker by hiring some unexpected staff with appropriate computing and data handling experience - including astrophysicists.

His goal is to reduce the time it takes from identifying a target for disease intervention to finding a molecule that acts against it from an average 5.5 years today to just one year in future.

"That is a stretch. But as we've learnt more about what modern supercomputers can do, we've gained more confidence," Baldoni told Reuters. "We have an obligation to reduce the cost of drugs and reduce the time it takes to get medicines to patients."

Earlier this year GSK also entered a collaboration with the U.S. Department of Energy and National Cancer Institute to accelerate pre-clinical drug development through use of advanced computational technologies.

The new deal with Exscientia will allow GSK to search for drug candidates for up to 10 disease-related targets. GSK will provide research funding and make payments of 33 million pounds ($43 million) if pre-clinical milestones are met.

($1 = 0.7682 pounds)

(Reporting by Ben Hirschler; Editing by Adrian Croft/Keith Weir)


          The Strange Loop 2013        

This was my second time at The Strange Loop. When I attended in 2011, I said that it was one of the best conferences I had ever attended, and I was disappointed that family plans meant I couldn't attend in 2012. That meant my expectations were high. The main hotel for the event was the beautiful DoubleTree Union Station, an historic castle-like building that was once an ornate train station. The conference itself was a short walk away at the Peabody Opera House. Alex Miller, organizer of The Strange Loop, Clojure/West, and Lambda Jam (new this year), likes to use interesting venues to make the conferences extra special.

I'm providing a brief summary here of what sessions I attended, followed by some general commentary about the event. As I said last time, if you can only attend one conference a year, this should be the one.

  • Jenny Finkel - Machine Learning for Relevance and Serendipity. The conference kicked off with a keynote from one of Prismatic's engineering team talking about how they use machine learning to discover news and articles that you will want to read. She did a great job of explaining the concepts and outlining the machinery, along with some of the interesting problems they encountered and solved.
  • Maxime Chevalier-Boisvert - Fast and Dynamic. Maxime took us on a tour of dynamic programming languages through history and showed how many of the innovations from earlier languages are now staples of modern dynamic languages. One slide presented JavaScript's take on n + 1 for various interesting values of n, showing the stranger side of dynamic typing - a "WAT?" moment.
  • Matthias Broecheler - Graph Computing at Scale. Matthias opened his talk with an interesting exercise of asking the audience two fairly simple questions, as a way of illustrating the sort of problems we're good at solving (associative network based knowledge) and not so good at solving (a simple bit of math and history). He pointed out the hard question for us was a simple one for SQL, but the easy question for us would be a four-way join in SQL. Then he introduced graph databases and showed how associative network based questions can be easily answered and started to go deeper into how to achieve high performance at scale with such databases. His company produces Titan, a high scale, distributed graph database.
  • Over lunch, two students from Colombia told us about the Rails Girls initiative, designed to encourage more young women into the field of technology. This was the first conference they had presented at and English was not their native language so it must have been very nerve-wracking to stand up in front of 1,100 people - mostly straight white males - and get their message across. I'll have a bit more to say about this topic at the end.
  • Sarah Dutkiewicz - The History of Women in Technology. Sarah kicked off the afternoon with a keynote tour through some of the great innovations in technology, brought to us by women. She started with Ada Lovelace and her work with Charles Babbage on the difference engine, then looked at the team of women who worked on the ENIAC, several of whom went on to work on UNIVAC 1. Admiral Grace Hopper's work on Flow-Matic - part of the UNIVAC 1 project - and subsequent work on COBOL was highlighted next. Barbara Liskov (the L in SOLID) was also covered in depth, along with several others. These are good role models that we can use to encourage more diversity in our field - and to whom we all owe a debt of gratitude for going against the flow and marking their mark.
  • Evan Czaplicki - Functional Reactive Programming in Elm. This talk's description had caught my eye a while before the conference, enough so that I downloaded Elm and experimented with it, building it from source on both my Mac desktop and my Windows laptop, during the prerelease cycle of what became the 0.9 and 0.9.0.2 versions. Elm grew out of Evan's desire to express graphics and animation in a purely functional style and has become an interesting language for building highly interactive browser-based applications. Elm is strongly typed and heavily inspired by Haskell, with an excellent abstraction for values that change over time (such as mouse position, keyboard input, and time itself). After a very brief background to Elm, Evan live coded the physics and interaction for a Mario platform game with a lot of humor (in just 40 lines of Elm!). He also showed how code updates could be hot-swapped into the game while it was running. A great presentation and very entertaining!
  • Keith Adams - Taking PHP Seriously. Like CFML, PHP gets a lot of flak for being a hot mess of a language. Keith showed us that, whilst the criticisms are pretty much all true, PHP can make good programmers very productive and enable some of the world's most popular web software. Modern PHP has traits (borrowed from Scala), closures, generators / yield (inspired by Python and developed by Facebook). Facebook's high performance "HipHop VM" runs all of their PHP code and is open source and available to all. Facebook have also developed a gradual type checking system for PHP, called Hack, which is about to be made available as open source. It was very interesting to hear about the pros and cons of this old warhorse of a language from the people who are pushing it the furthest on the web.
  • Chiu-Ki Chan - Bust the Android Fragmentation Myth. Chiu-Ki was formerly a mobile app developer at Google and now runs her own company building mobile apps. She walked us through numerous best practices for creating a write-once, run-anywhere Android application, with a focus on various declarative techniques for dealing with the many screen sizes, layouts and resolutions that are out there. It was interesting to see a Java + XML approach that reminded me very much of Apache Flex (formerly Adobe Flex). At the end, someone asked her whether similar techniques could be applied to iOS app development and she observed that until very recently, all iOS devices had the same aspect ratio and same screen density so, with auto-layout functionality in iOS 6, it really wasn't much of an issue over in Apple-land.
  • Alissa Pajer - Category Theory: An Abstraction for Everything. In 2011, the joke was that we got category theory for breakfast in the opening keynote. This year I took it on by choice in the late afternoon of the first day! Alissa's talk was very interesting, using Scala's type system as one of the illustrations of categories, functors, and morphisms to show how we can use abstractions to apply knowledge of one type of problem to other problems that we might not recognize as being similar, without category theory. Like monads, this stuff is hard to internalize, and it can take many, many presentations, papers, and a lot of reading around the subject, but the abstractions are very powerful and, ultimately, useful.
  • Jen Myers - Making Software Development Make Sense For Everyone. Closing out day one was a keynote by Jen Myers, primarily known as a designer and front end developer, who strives to make the software process more approachable and more understandable for people. Her talk was a call for us all to help remove some of the mysticism around our work and encourage more people to get involved - as well as to encourage people in the software industry to grow and mature in how we interact. As she pointed out, we don't really want our industry to be viewed through the lens of movies like "The Social Network", which makes developers look like assholes!
  • Martin Odersky - The Trouble with Types. The creator of Scala started day two by walking us through some of the commonly perceived pros and cons of both static typing and dynamic typing. He talked about what constitutes good design - discovered, rather than invented - and then presented his latest work on type systems: DOT and the Dotty programming language. This collapses some of the complexities of parameterized types (from functional programming) down onto a more object-oriented type system, with types as abstract members of classes. Compared to Scala (which has both functional and object-oriented types), this provides a substantial simplification without losing any of the expressiveness, and could be folded into "Scala.Next" if they can make it compatible enough. This would help remove one of the major complaints against Scala: the complexity of its type system!
  • Mridula Jayaraman - How Developers Treat Ovarian Cancer. I missed Ola Bini's talk on this topic at a previous conference so it was great to hear one of his teammates provide a case study on this fascinating project. ThoughtWorks worked with the Clearity Foundation and Annai Systems - a genomics startup - to help gather and analyze research data, and to automate the process of providing treatment recommendations for women with ovarian cancer. She went over the architecture of the system and (huge!) scale of the data, as well as many of the problems they faced with how "dirty" and unstructured the data was. They used JRuby for parsing the various input data and Clojure for their DSLs, interacting with graph databases, the recommendation engine and the back end of the web application they built.
  • Crista Lopes - Exercises in Style. Noting that art students are taught various styles of art, along with analysis of those styles, and the rules and guidelines (or constraints) of those styles, Crista observed that we have no similar framework for teaching programming styles. The Wikipedia article on programming style barely goes beyond code layout - despite referencing Kernighan's "Elements of Programming Style"! She is writing a book called "Exercises in Programming Style", due in Spring 2014 that should showcase 33 styles of programming. She then showed us a concordance program (word frequencies) in Python, written in nine different styles. The code walkthrough got a little rushed at the end but it was interesting to see the same problem solved in so many different ways. It should be a good book and it will be educational for many developers who've only been exposed to one "house" style in the company where they work.
  • Martha Girdler - The Javascript Interpreter, Interpreted. Martha walked us through the basics of variable lookups and execution contexts in JavaScript, explaining variable hoisting, scope lookup (in the absence of block scope) and the foibles of "this". It was a short and somewhat basic preso that many attendees had hoped would be much longer and more in depth. I think it was the only disappointing session I attended, and only because of the lack of more material.
  • David Pollak - Getting Pushy. David is the creator of the Lift web framework in Scala that takes a very thorough approach to security and network fallibility around browser/server communication. He covered that experience to set the scene for the work he is now doing in the Clojure community, developing a lightweight push-based web framework called Plugh that leverages several well-known Clojure libraries to provide a seamless, front-to-back solution in Clojure(Script), without callbacks (thanks to core.async). Key to his work is the way he has enabled serialization of core.async "channels" so that they can be sent over the wire between the client and the server. He also showed how he has enabled live evaluation of ClojureScript from the client - with a demo of a spreadsheet-like web app that you program in ClojureScript (which is round-tripped to the server to be compiled to JavaScript, which is then evaluated on the client!).
  • Leo Meyerovich - Thinking DSLs for Massive Visualization. I had actually planned to attend Samantha John's presentation on Hopscotch, a visual programming system used to teach children to program, but it was completely full! Leo's talk was in the main theater so there was still room in the balcony and it was an excellent talk, covering program synthesis and parallel execution of JavaScript (through a browser plugin that offloads execution of JavaScript to a specialized VM that runs on the GPU). The data visualization engine his team has built has a declarative DSL for layout, and uses program synthesis to generate parallel JS for layout, regex for data extraction, and SQL for data analysis. The performance of the system was three orders of magnitude faster than a traditional approach!
  • Chris Granger - Finding a Way Out. Some of you may have been following Chris's work on LightTable, an IDE that provides live code execution "in place" to give instant feedback as you develop software. If you're doing JavaScript, Python, or Clojure(Script), it's worth checking out. This talk was more inspirational that product-related (although he did show off a proof of concept of some of the ideas, toward the end). In thinking about "How do we make programming better?" he said there are three fundamental problems with programming today: it is unobservable, indirect, and incidentally complex. As an example, consider person.walk(), a fairly typical object-oriented construct, where it's impossible to see what is going on with data behind the scenes (what side effects does it have? which classes implement walk()?). We translate from the problem domain to symbols and add abstractions and indirections. We have to deal with infrastructure and manage the passage of time and the complexities of concurrency. He challenged us that programming is primarily about transforming data and posited a programming workflow where we can see our data and interactively transform it, capturing the process from end to end so we can replay it forwards and backwards, making it directly observable and only as complex as the transformation workflow itself. It's an interesting vision, and some people are starting to work on languages and tools that help move us in that direction - including Chris with LightTable and Evan with Elm's live code editor - but we have a long way to go to get out of the "tar pit".
  • Douglas Hofstadter, David Stutz, a brass quintet, actors, and aerialists - Strange Loops. The two-part finale to the conference began with the author of "Gödel, Escher, Bach" and "I am a Strange Loop" talking about the concepts in his books, challenging our idea of perception and self and consciousness. After a thought-provoking dose of philosophy, David Stutz and his troupe took to the stage to act out a circus-themed musical piece inspired by Hofstadter's works. In addition to the live quintet, Stutz used Emacs and Clojure to provide visual, musical, and programmatic accompaniment. It was a truly "Strange" performance but somehow very fitting for a conference that has a history of pushing the edges of our thinking!

Does anything unusual jump out at you from the above session listing? Think about the average technical conference you attend. Who are the speakers? Alex Miller and the team behind The Strange Loop made a special effort this year to reach out beyond the "straight white male" speaker community and solicit submissions from further afield. I had selected most of my schedule, based on topic descriptions, before it dawned on me just how many of the speakers were women: over half of the sessions I attended! Since I didn't recognize the vast majority of speaker names on the schedule - so many of them were from outside the specific technical community I inhabit - I wasn't really paying any attention to the names when I was reading the descriptions. The content was excellent, covering the broad spectrum I was expecting, based on my experience in 2011, with a lot of challenging and fascinating material, so the conference was a terrific success in that respect. That so many women in technology were represented on stage was an unexpected but very pleasant surprise and it should provide an inspiration to other technology conferences to reach beyond their normal pool of speakers too. I hope more conferences will follow suit and try to address the lack of diversity we seem to take for granted!

I already mentioned the great venues - both the hotel and the conference location - but I also want to call out the party organized at the St Louis City Museum for part of the overall "wonder" of the experience that was The Strange Loop 2013. The City Museum defies description. It is a work of industrial art, full of tunnels and climbing structures, with a surprise around every corner. Three local breweries provided good beer, and there was a delicious range of somewhat unusual hot snacks available (bacon-wrapped pineapple is genius - that and the mini pretzel bacon cheeseburgers were my two favorites). It was quiet enough on the upper floors to talk tech or chill out, while Moon Hooch entertained loudly downstairs, and the outdoor climbing structures provided physical entertainment for the adventurous with a head for heights (not me: my vertigo kept me on the first two stories!).

In summary then, the "must attend" conference of the year, as before! Kudos to Alex Miller and his team!


          Detecting emotion with Machine Learning        
Machine Learning is a very hot topic these days. Getting started can be fast and easy. In this video post, I walk through the steps to build a simple Universal Windows Application (UWP) that connects to the Microsoft Cognitive Services and the Emotion API. The Microsoft Cognitive Services are a set of APIs that enable...
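
For readers outside the UWP world, the same Emotion API can be called from any language over REST. A rough Python sketch of what the app does under the hood (the endpoint region and exact response fields may differ for your subscription; the key and image URL are placeholders):

    import requests

    ENDPOINT = "https://westus.api.cognitive.microsoft.com/emotion/v1.0/recognize"
    KEY = "your-subscription-key"  # placeholder; issued via the Azure portal

    resp = requests.post(
        ENDPOINT,
        headers={"Ocp-Apim-Subscription-Key": KEY,
                 "Content-Type": "application/json"},
        json={"url": "https://example.com/face.jpg"},  # placeholder image URL
    )
    resp.raise_for_status()
    for face in resp.json():
        # Each detected face comes back with a bounding box and emotion scores.
        print(face["faceRectangle"], face["scores"])
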
          Step by Step Machine Learning: A Classification model to help the Humane Society        
This tutorial will walk you through a classification machine learning experiment to predict one of several possible outcomes for an animal brought to the humane society. If you missed the previous tutorials, you can find them here: Predicting survivors on the Titanic (two-class prediction); Analyzing breast cancer data (two-class prediction). For a more comprehensive introduction...
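
If you would rather reproduce the idea outside Azure ML, a minimal scikit-learn equivalent might look like the sketch below (the CSV file and column names are hypothetical stand-ins for the shelter data):

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    df = pd.read_csv("animal_outcomes.csv")   # hypothetical export of the shelter data
    X = pd.get_dummies(df[["AnimalType", "Sex", "AgeInDays", "Breed"]])
    y = df["Outcome"]                          # e.g. adoption, transfer, return to owner

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    print("held-out accuracy:", clf.score(X_te, y_te))
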
          Comments by Emine on the post "An Example of Courage: Rhodeus Script and Talha Zekeriya Durmuş"
I hope Talha's successes continue; what he did really takes a lot of courage. The TÜBİTAK jury can really talk nonsense sometimes. We entered another competition with a project that used machine learning, and they asked an absurd question about a definition we had made. Hopefully such setbacks won't happen to him.
          Model Selection: Underfitting, Overfitting, and the Bias-Variance Tradeoff        
In machine learning and pattern recognition, there are many ways (an infinite number, really) of solving any one problem. Thus it is important to have an objective criterion for assessing the accuracy of candidate approaches and for selecting the right model for a data set at hand. In this post we’ll discuss the concepts of […]
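
For squared loss, the standard decomposition behind that tradeoff is, for data y = f(x) + \epsilon with noise variance \sigma^2:

\[ \mathbb{E}\big[(y - \hat{f}(x))^2\big] = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2} + \underbrace{\operatorname{Var}\big[\hat{f}(x)\big]}_{\text{variance}} + \sigma^2. \]

Underfitting models carry high bias, overfitting models carry high variance, and model selection looks for the sweet spot between the two.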
          Discover 7 new Microsoft MCSA and MCSE certifications         
Microsoft have announced the launch of 6 new MCSA certifications and 1 new MCSE certification. This demonstrates Microsoft’s commitment to a growing Azure, Big Data, Business Intelligence (BI) and Dynamics community.

These new certifications and courses will support Microsoft partners looking to upskill and validate knowledge in these technologies.  

Following the huge changes announced in September, these new launches will simplify your path to certification. They'll minimise the number of steps required to earn a certification, while allowing you to align your skills to industry-recognised areas of competence.

This blog will outline the new certifications Microsoft have announced, focusing on the technologies, skills and job roles they align to. 

So what's new?


MCSA: Microsoft Dynamics 365

This MCSA: Microsoft Dynamics 365 certification is one of three Dynamics 365 certifications launched. It demonstrates your expertise in upgrading, configuring and customising the new Microsoft Dynamics 365 platform.

There are currently no MOCs aligned to this certification. We have developed our own Firebrand material that will prepare you for the following two exams needed to achieve this certification:
  • MB2-715: Microsoft Dynamics 365 customer engagement Online Deployment
  • MB2-716: Microsoft Dynamics 365 Customization and Configuration 
This certification will validate you have the skills for a position as a Dynamics 365 developer, implementation consultant, technical support engineer or system administrator.

This certification is a prerequisite for the MCSE: Business Applications. 

MCSA: Microsoft Dynamics 365 for Operations

The second of these three Dynamics 365 certs is the MCSA: Microsoft Dynamics 365 for Operations. Here, you’ll get the skills to manage a Microsoft SQL Server database and customise Microsoft Dynamics 365.

On this course, you’ll cover the following MOC:
  • 20764: Administering a SQL Database Infrastructure 
There is currently no MOC for the second part of this course, so it will be covered by Firebrand's own material. 

To achieve this certification you’ll need to pass the following exams:
  • 70-764: Administering a SQL Database Infrastructure
  • MB6-890: Microsoft Development AX Development Introduction 
Earning this cert proves you have the technical competence for positions such as Dynamics 365 developer, solutions architect or implementer.  

Just like the MCSA: Microsoft Dynamics 365, this certification is also a prerequisite to the new MCSE: Business Applications certification. 

MCSE: Business Applications

Earning an MCSE certification validates a more advanced level of knowledge. The MCSE: Business Applications certification proves an expert-level competence in installing, operating and managing Microsoft Dynamics 365 technologies in an enterprise environment.

In order to achieve this certification you’ll be required to pass either the MCSA: Microsoft Dynamics 365 or the MCSA: Microsoft Dynamics 365 for Operations. You’ll also be required to choose one of the following electives to demonstrate expertise on a business-specific area:
  • MB2-717: Microsoft Dynamics 365 for Sales
  • MB2-718: Microsoft Dynamics 365 for Customer Service
  • MB6-892: Microsoft Dynamics AX - Distribution and Trade
  • MB6-893: Microsoft Dynamics AX - Financials  
Earning your MCSE: Business Applications certification will qualify you for the roles such as Dynamics 365 developer, implementation consultant, technical support engineer, or system administrator.

MCSA: Big Data Engineering

This MCSA: Big Data Engineering certification demonstrates you have the skills to design and implement big data engineering workflows with the Microsoft cloud ecosystem and Microsoft HDInsight to extract strategic value from your data.

On this course you’ll cover the following MOCs:
  • 20775A: Perform Data Engineering on Microsoft HDInsight – expected 28/6/2017
  • 20776A: Engineering Data with Microsoft Cloud Services – expected 08/2017
And take the following exams:
  • 70-775: Perform Data Engineering on Microsoft HDInsight – available now in beta
  • 70-776: Engineering Data with Microsoft Cloud Services – expected Q1 2018
This course is aimed at data engineers, data architects, data scientists and data developers.

Earning this MCSA acts as a prerequisite, and your first step, to achieving the MCSE: Data Management and Analytics credential.

MCSA: BI Reporting

This MCSA: BI Reporting certification proves your understanding of data analysis using Power BI. You’ll learn the skills to create and manage enterprise business intelligence solutions.

The MOCs you’ll cover on this course include:
  • 20778A: Analyzing Data with Power BI
  • 20768B: Developing SQL Data Models 
In order to achieve the certification, you’ll take the following exams:
  • 70-778: Analyzing Data with Power BI - expected Q1 2018
  • 70-768: Developing SQL Data Models 
This certification is aimed at database professionals needing to create enterprise BI solutions and present data using alternative methods.

This certification is a prerequisite for the MCSE: Data Management and Analytics credential. 

MCSA: Cloud Database Development 

This MCSA: Cloud Database Development certification will prove you have the skills to build and implement NoSQL solutions with DocumentDB and Azure Search for the Azure data platform.

This certification covers the following MOCs:
  • 40441: Designing and Implementing Cloud Data Platform Solutions
  • 20777: Implementing NoSQL Solutions with DocumentDB and Azure Search – expected in August 2017 
In order to achieve the certification, you'll have to pass the following exams: 
  • 70-473: Designing and Implementing Cloud Data Platform Solutions
  • 70-777: Implementing NoSQL Solutions with DocumentDB and Azure Search – expected in Q1 2018
This course is aimed at specialist professionals looking to validate their skills and knowledge of developing NoSQL solutions for the Azure data platform. 

This certification is also a prerequisite certification to the MCSE: Data Management and Analytics credential. 

MCSA: Data Science

This course will teach you the skills to operationalise Microsoft Azure Machine Learning and Big Data with R Server and SQL R Services. You'll learn to process and analyse large data sets using R and use Azure cloud services to build and deploy intelligent solutions.

This certification covers the following MOCs:
  • 20773A: Analyzing Big Data with Microsoft R – in development, expected May 2017
  • 20774A: Perform Cloud Data Science with Azure Machine Learning – in development, expected June 2017
To achieve this certification you’ll be required to pass the following exams:
  • 70-773: Analyzing Big Data with Microsoft R – available now in beta
  • 70-774: Perform Cloud Data Science with Azure Machine Learning – available now in beta 
This certification, which is your first step to the MCSE: Data Management and Analytics cert, is best suited to data science or data analyst job roles. 


          SXSW Twitter Panels        

There has always been a strong connection between Twitter and South by Southwest, since Twitter took the festival by storm in 2007 and won the Web Award. Four years later, the relationship is still as strong as ever, both as a community platform and as a subject. There are 43 Twitter-related panels currently proposed for SXSWi 2011! All are listed below with descriptions from the PanelPicker; certainly something for everyone. Voting ends at 11:59 CDT on Friday, August 27.

 

 

Teaming Up On Twitter Is Great For Business

Kendall Morris, Fahrenheit
We’ve seen Twitter grow from infancy to a major player in 4 short years. Where is it going next and how can businesses get the most from it? When a big business stakes a claim in the Twitterverse it can quickly become a full time job for one or more people to manage the conversations, content, relationships and resolutions. Twitter teams are often a solution but learning how to manage the team can be a big challenge. Identifying the right approach for your business is critical to the success of your program. Is it better to have one handle with multiple people behind it or many handles with common branding? Twitter has been beta testing a feature called “Contributors” that could be a boon to businesses. It allows for the benefits of individual profiles as well as a unified branded voice. There is also great potential to marry “Contributors” with “Places” for a customer experience that is personalized and highly relevant to the consumers needs. Tools developed by Hootsuite and Co-Tweet have team oriented capabilities that help develop a unified voice for your brand as well as the ability to manage multiple people working towards a common goal. Leveraging the resources available can really gain a big return for businesses looking to use Twitter to their best advantage.
Branding / Marketing / Publicity Business Strategy, twitter

 

Building with Twitter – How to Dominate the API

Vishal Sankhla, Viralheat
Few APIs spew out as much data as Twitter’s does, and few come anywhere close to its popularity with developers. But it’s not a walk in the park. It takes a lot of careful design and experience to build apps that please users even when Twitter is overloaded or the API’s limitations get in your way. In this panel, we'll talk about lessons from developers in the field who have tapped into Twitter’s API successfully. The panelists will share their technical and strategic tips for how to build applications with the API that perform consistently, reliably and innovate beyond basic uses. Whether you’re thinking about using the Twitter API for the first time or are a seasoned pro, this panel will be an insightful discussion about the techniques and strategies that help you make the most of it.
Web Apps / Widgets Application development, how-to, twitter

How to Revolutionize Healthcare with Clever Twitter Applications

Tal Friedman, lifeaftercancer.wordpress.com
Want to learn how to use Twitter as a game-changer in healthcare? Find out how we can go way beyond just tweeting at, following and friending doctors, nurses, patients and caregivers and actually BUILD applications and communities to make a quantum leap forward in how we communicate our healthcare needs, share tips and gather information we can trust. I'll show you sample Twitter applications that make clever use of lists, private accounts and even Twitter bots to revolutionize healthcare in the 21st century.
Health applications, Health , twitter

 

Secrets of Fake Twitter Accounts Revealed (maybe)

Jasper Slobrushe, @JasperSlobrushe
The most dead-on social commentary of the BP gulf oil spill came in the form of a parody Twitter account—who would have thought? Much has been made of the potential for social media to promote brands, but what if you don't have one? Many folks haven't let that stop them, either inventing or taking on the persona of an existing company or public figure. What's the point? The panelists will discuss exactly that.
Social Networking fake, parody, twitter

Pets, They Are Atwitter

Sloane Kelley, BFG Communications
Thousands of pets, animal owners and businesses make up an ever-growing and active niche community on Twitter, where some have even gone so far as to create Twitter personalities for their pets. This group of “anipals” (as they’re known on Twitter) and their owners are in constant communication, entertaining one another, building relationships and sharing information. What other Twitter communities can boast a band (The Shibbering Cheetos @ShibberingC), a newspaper complete with advice columnists (The Anipal Times @anipaltimes), and even a suave government spy (@JamesBondTheDog)? The group also uses Twitter to organize virtual and real world events, many of which have a charitable tie-in. Tens of thousands of dollars have been raised for animal-based charities thanks to this very passionate community and its series of “pawties.” For years, the web has made it easier for niche groups like this to communicate but social media and Twitter, in particular, have made it even easier and more fun. This session will take you behind the curtain into Twitter’s animal community, its organic rise and the people who’ve made it happen. We’ll also talk about what other niche groups can learn from “anipals” on building community and making a difference.
Community / Online Community Charity, pets, twitter

Using Twitter to Improve College Student Engagement

Reynol Junco, Lock Haven University
While faculty and staff at higher education institutions have experimented with the use of social media, there has not been a concerted effort to integrate these technologies in educationally-relevant ways. Emerging research in the field of social media, student engagement, and success shows that there are specific ways that these technologies can be used to improve educational outcomes. This presentation will focus on reviewing and translating research on the effects of Twitter on college students into effective and engaging educational practices. Background research on the psychological construct of engagement will be provided and will be linked to engagement in online social spaces. In addition to presenting cutting-edge research on how to create engaging and engaged communities, the presenter will review specific ways that Twitter can be used in the classroom and the co-curriculum. The presenter will discuss how academicians can hack existing technologies, specifically Twitter, for educational good and will present the results of his latest research on the effects of Twitter on student engagement and grades.
Education Engagement, Higher Education, social media

Building Relationships -- and Revenues -- Through Twitter

Justin Goldsborough, Fleishman-Hillard
10 ways to profit off of Twitter participation without a book deal, sponsorship or selling your startup. We're all asked to justify the ROI of our time spent on Twitter. Some people land book deals. Others get sponsors. But the rest of us can also make a profit off our tweets. Anyone can do it, and you don't need 60,000 followers. This presentation will show you how to increase billings, drive new business, get better customer service and even find a job in 140 characters or less. The presenters co-moderate #pr20chat on Twitter, a weekly conversation about technology's influence on communication. We can directly tie increased business revenues to our participation in the chat and engagement in social media.
Entrepreneurism / Monetization #pr20chat, Social Media ROI, twitter

Startup Marketing: It's More Than a Twitter Account

Saul Colt, Saul.is
Have you ever looked at the Sky Mall catalog and said to yourself, "How have I gone this long without an ankle air conditioner?" Probably not, but the fact is that there are a lot of startups out there, and more will be formed before you finish reading this description. It's getting harder and harder to get the attention you need to prosper and reach your prospective audience, but that doesn't mean it is impossible! This talk is going to cover a few tricks and tips for actually marketing your business from a startup perspective, and you may be surprised that this talk isn't going to be all about Social Media!
Branding / Marketing / Publicity marketing, Startup, twitter

 

Twittering with Bedouin; Delivering Social Media to All

Darrin Husmann, lalaOKC
This presentation will address the social and technical issues surrounding the development of a semi-ubiquitous social networking platform. It encourages participation by understanding participant motives, culture and technology. Current social media platforms are limited in their ability to decrease opportunity costs and provide value by requiring specific platforms, interfaces, access to technology, expectations of a certain level of education (or knowledge) and/or cultural outlook. Many of the social media platforms use the influence of market exclusivity to create demand, effectively omitting large segments of users by design. I will discuss and demonstrate an application that seeks to dramatically expand the availability of social networking, across limits in technology, cultural changes and demographics. This application will unify emerging and current technology with third world communication networking platforms. The goal is not to provide everyone with HTML 5; it is to effect real-time communication across a larger segment of humanity. This application will integrate intelligence and social media platforms in a manner that is holistic with the consumer's tastes. The efficacy of this program will be demonstrated and discussed with coupons, sms and a micro business. It will be fun.
Accessibility Culture, marketing, social media

 

Twittersex: Tweet Me for a Good Time. #oxytocin

Jennie Chen, Chenergy Consulting
The internet isn't just the information highway anymore. The internet is a place where relationships can start and communities are built. With the growing popularity of social media, it is increasingly socially acceptable to make relationships in real life with people we meet online. Online dating is no longer taboo, though many people still feel that a genuine connection and chemical spark can't be started merely through online interactions. Cybersex is one way that the human body can be sexually aroused through online interaction, and now there's growing evidence that biological changes related to emotional bonding (not just sexual pleasure) can be stimulated through online interactions. Oxytocin, known as the "cuddle hormone", is released during bonding moments and social support, which includes physical touching. There is now evidence that oxytocin is released when interacting with others via Twitter, perhaps with users that one has not yet met in real life. This shows that intimate relationships, both emotional and physical (in the form of foreplay and sexual arousal), can be started online. However, there is more to getting the hormonal relationship juices to flow than just flirting on the internet. We'll explore how social interactions can release various types of hormones. Learn how that translates into developing romantic and business relationships using social media tools.
Online Relationships hormones, Relationships, social media

 

Using Twitter and related technologies as research tools

Elizabeth Winkler, University of Texas at Austin
Our goal is to present new research techniques which the academic world has developed that can benefit the industry in areas such as market research, advertising, and consumer relations management. Blending together computational linguistics, natural language processing, computer science, marketing, and advertising, the panelists have managed to use Twitter and other social media for research. The results, and the processes leading to them, can greatly benefit the industry. The panelists are world-renowned professors known to be at the forefront of technology as well as the Online Managing Editor of the Massachusetts Institute of Technology (MIT)’s Sloan Management Review, a publication targeting top industry executives.
Social Networking machine learning, market research, natural language processing

 

Twitter Killed Christmas & Other Social Media Myths Dispelled

Vikki Chowney, Reputation Online
This session will look at ways in which various companies that have seen customer service problems turn into serious online crises might have responded, and how social media could then have succeeded in turning a negative into a positive. Many 'social media fails' turn out to be nothing to do with the new technologies shaping this young industry, but a lack of response to customer complaints. Here at Reputation Online, we've become increasingly frustrated with the sensationalism surrounding some of these situations and want to start talking about the real cause of so many issues - customer service. The past 12 months have seen the Twitter-obsessed social media industry stuck in a kind of never-ending loop, in which the same mistakes are being made by different brands. In the UK we've seen high-street stationer Paperchase accused of plagiarising a designer's work (an accusation it subsequently denied) before denouncing the situation as proof of the 'dangers of Twitter'. Eurostar faced criticism for not using Twitter during Christmas 2009 when people were stuck on a train somewhere between London and Paris, and even Nokia failed to spot a filmmaker's complaint in the midst of a copyright debacle. Editor of Reputation Online - Vikki Chowney - will re-imagine some of the high-profile 'fails' that have seen their fair share of negative press as if customer service had been established as the primary objective of the brands involved. We're sure we'll have Tony Hsieh's vote on this one at least.
Online Relationships customer service, reputation, social media fail

 

Making Money on Twitter: The Story of @DellOutlet

Ricardo Guerrero, Social Media Dynamo
Dell is the case study most often cited regarding successful sales strategies via Twitter. In December 2009, Dell released numbers indicating it had made $6.5 million since starting its Twitter sales accounts 2.5 years earlier. Specifically, the Dell Outlet started them on this path with the launch of the @DellOutlet account in June 2007, and it was predominantly @DellOutlet that drove those sales and established a model for use by other Dell departments and regions. Although a lot has been written in blogs about this case study, much of it has been opinion pieces and has rarely, if ever, included the perspective of the founders of @DellOutlet. The purpose of this panel is to highlight both the history of this Twitter success story from the perspective of the individuals who came up with the approach and helped grow it as well as sharing lessons learned that can be applied at other organizations, large and small. Ample time will be allotted for questions and dialogue with (and hopefully amongst) attendees of this session.
Social Networking Dell, twitter

 

OMG, My Customer’s Pissed and Uses Twitter

Rob LaGesse, www.rackspace.com
Too many customers are sitting listening to hold music waiting for their problem to get resolved. Instead of stewing privately they are now airing their grievances publicly. To anyone and everyone that will listen. The BP oil spill and Toyota recalls have shown us how people are using social media tools to give pissed off customers a new voice – and it’s a megaphone. Knowing your customer and understanding how to address everything from a crisis to the everyday question quickly and effectively is critical. Learn about some of the biggest flubs from 2010, how the ball was dropped and what could have been done differently. Don’t make the same mistakes they did. Learn how not to mess up.
Social Networking customer service

 

Follow Me - Using Twitter to Save Lives

DJ Edgerton, www.pixelsandpills.com
Could you save a life in 140 characters? That was the challenge put to the development team at Zemoga, a leading interactive agency. Using the Twitter API they created Follow Me, a Twitter app that connects patients, doctors and caregivers. While many pharma and healthcare companies have grappled with how best to use social media, firms like Zemoga have taken it to the next level by focusing on the patient first. Follow Me lets disease state sufferers update physicians, family members and other caregivers on their health states as easily as tweeting about Justin Bieber or last night's baseball game. An easy-to-use interface lets them select an emotional or physical state and it's sent out to a private Twitter network made up of followers that have been authorized by the patient. Doctors can view all of their patients' statuses through a customized dashboard and follow up with the ones who've expressed a negative emotional or physical state. They can ask questions about physical conditions, compliance with drug prescriptions, and other highly relevant and personal subjects. Family members and caregivers can also check in, monitoring conditions and sending reminders to patients about diet, compliance or other healthcare related matters. While a Follow Me demo will form the heart of the presentation, we want to encourage a discussion about how pharma/healthcare can move beyond the current "mass market" approach to patient communication and engage individuals using social media.
Health Health , social media, twitter

 

Crowdsourcing Academe: A Twittered Panel

Virginia Kuhn, Institute for Multimedia Literacy, University of Southern California
Social media and newer Information and Communication Technologies have begun to shape the slow-moving culture of higher education in vital ways. In seven days scholars from across the globe crowdsourced a 'book', submitting work via Twitter and blog posts. The project, titled Hacking the Academy, was begun by Tom Scheinfeldt and Dan Cohen as a project of the Center for History and New Media, George Mason University. Although this project is not unprecedented--indeed in 2003 academics at the Computers and Writing Conference at Purdue assembled a digital book in an afternoon--it is remarkable both for its reach and its use of Web 2.0 tools designed for social networking, as authors submitted from disparate geographic locales. Such projects beg pressing questions about the viability of the way that universities conduct the business of education and knowledge production in the 21st century. This panel will examine the ways in which digital technologies challenge disciplinary, professional and pedagogical boundaries. We have already crowdsourced participation in this panel by tweeting a call for interest through the Hacking the Academy hashtag and will include a simultaneous vibrant Twitter discussion which our remote international colleagues will facilitate.
Education

 

Exploring the Twitter APIs

Matt Harris, Twitter
This panel will cover the recently released and popular features of the Twitter API and explore creative ways they have been used. We'll discuss the developments over the past year and what you can expect from the API team in the future. We'll also be sharing some stories about how some of these new features came to be and reveal some of the challenges we had to overcome to release them. The panel will respond to a selection of questions received before SXSW and open up for audience questions as well.
Web Apps / Widgets API, twitter, twitterapi

 

Geppetto's Army: Creating International Incidents with Twitter Bots

Greg Marra, Google
Twitter has proven to be an invaluable tool for communication during intense periods of political unrest and social suppression. When thousands of people tweet about oppressive regimes and violence against protesters, the outside world gets a chance to understand events on the ground. But what if none of those thousands of people were real, and the events never happened? Previous research has shown that Twitter bots can build up a following, garnering hundreds of emotionally invested followers who are fooled into believing the bots are real. A single puppetmaster could create hundreds of Twitter bots, letting them live perfectly normal and believable lives for months while they build up followers. Then one day, a carefully crafted false story unfolds on the stage of social media, played out by a single director with hundreds of actors. Incidents like Balloon Boy demonstrate that powerful stories can become widespread before there is time for fact checking. Before anyone realizes all the TwitPics of the massacre are faked, the fake event will have made international headlines. This presentation will discuss the technical feasibility of such an attack on the global media infrastructure and discuss the implications of a news system that trusts "recent" over "reputable".
Journalism journalism, misinformation, twitter

 

Dining Out in a Twitter Generation

Darcy Cobb, www.urbanspoon.com
New mobile devices, local apps, social media, rich media (from YouTube to Flickr) have transformed the way that consumers communicate with and about their favorite local businesses, and also the way that businesses reach out to audiences and manage their reputation. This panel will examine how local/mobile/social technologies and consumer behavior are changing the way we do business, what businesses need to do to become more plugged in, and which companies are driving innovation in the space.
Online Relationships Geo-tagging, Local mobile, online reputation

 

Developer Communities: Scaling Twitter-Like Ecosystems

Jonathan Markwell, Inuda
Over 75% of Twitter's traffic is to its API, from the ecosystem of applications that are built by third-party developers. With other services headed in a similar direction, it's time to explore the challenges that come from scaling the communities of developers that form around APIs.
Community / Online Community community, developers, platforms

 

How Twitter Parties Can Help Your Biz

Crissy Herron, Indie Biz Chicks
Everyone knows that Twitter is a hot topic. There are lots of "gurus" and "experts" with opinions on using Twitter for business, but one thing these people often fail to mention is Twitter Parties! I host a weekly Twitter Party, #indiebizchat, and have also been a panelist on #GNO small business sessions. I know firsthand that Twitter Parties are a great way to reach a wide audience, gain a large number of followers in a short amount of time, and of course, market your business. This presentation will teach people how to create Twitter Parties for free, how to find topics that will attract their target audience, how to find guests and panelists that can provide great info on the topic, how to use these parties to promote a business, as well as how to earn additional income through the parties.
Branding / Marketing / Publicity social media, twitter, Twitter Party

 

Facebook, Twitter & Beyond: What's a Parent to do?

Keri Pearlson, kp partners
This interactive session is a discussion hosted by a mom and her 14-year-old daughter about using social networks. Mom works on social networks as part of her job. Her daughter played on NeoPets when she was 7. At 12 she asked for a 'Facebook'. As a young teen, she attacked Twitter. MySpace isn't of interest. She attended SXSWi sessions last year to learn more. Her parents set boundaries for each tool, and monitor her activities. She's always finding new apps, and trying them out. She likes to be on social networks while watching TV and would be on the computer all the time if not for the boundaries. Fights occur every time a boundary is pushed. What's a parent to do?
Social Networking Facebook, parenting, Social Networking

 

Twitter Annotations and the Real-Time Semantic Web

Joshua Shinavier, Franz Inc.
There's more to the real-time Web than snippets of text. Real-time services such as Twitter and Facebook have, additionally, begun to provide rich, structured metadata for use by applications: data about places, events, web pages, and, with Twitter Annotations, anything else that can be described in JSON or XML. This data opens the door to mashups with the large bodies of linked data already deployed on the Web, enabling new and smarter applications. Instead of a stream of tweets tagged with #sxsw, for example, how about a stream of tweets by anyone attending SXSW, about films by young French directors or presentations by anyone the user has co-authored an article with? In such scenarios, the Semantic Web offers a shared information space in which applications can simultaneously interact with data from disparate datasets and real-time services, cutting down on case-by-case application logic and manual integration of data sources. This session will explore the intersection of Twitter Annotations with the Semantic Web, including 1) interlinking Annotations resources and vocabularies with the Web of Data 2) using graph databases for geospatial and temporal search on the Semantic Stream 3) Annotations and the Internet of Things 4) social network analysis enabled by Annotations and linked data 5) tools and techniques for end-user application development
New Technology / Next Generation realtime, SemanticWeb, twitter

 

The Future of Local: Foursquare, Twitter and Yelp

Eric Singley, Yelp
Local search as we once knew it has been redefined. Today consumers are increasingly relying on location-based mobile apps to discover what's around them. What does this shift in user behavior look like, and how are small business owners leveraging these channels to attract new customers? The land grab for mindshare and a piece of the estimated $130+ billion local ad market* has been fast and furious. Product gurus from Foursquare, Twitter and Yelp sound off on what type of engagement they are seeing from both consumers and small businesses. (*BIA/Kelsey’s Local Media Annual Forecast, Feb 2010)
Geolocation local, Mobile Consumers, small business

 

Twittering in Jungles: Social Tools and Developing Economies

Annemarie Dooling, Independent
Travel has made a big name for itself on the Internet. Digital nomads use social networking to find out complete news on new destinations, lodging and friends across the globe, and to contact service providers for problems, tips, and deals. Twitter, Facebook, Foursquare, and others are making it easy for travelers to connect, find their passions and means, and head out into the open road. Philanthropy, not to be outdone, has made its mark, too. Twestival, Tweetsgiving, and the numerous crowdsourcing charity ideas that circle the web now make it easier than ever for you to give and make your mark on this world. It was only a matter of time before these two amazing social networking power genres got together: there's a whole great big world out there that we are a part of. If you don't stand up and change it, who will? Whether you're a digital nomad keeping the States up to date on worldly situations via Facebook, or a fundraiser who's crowdsourcing your mission to build schools in Haiti, you're probably knee-deep in new social tools and wondering how you can best utilize them for your needs.
Social Issues crowdsource, Non-profit, travel

 

Twitter / Facebook – Your Customer Re-Activation Program

Greg Bright, Greg Bright
Want new customers? – Get SEO. Want to reactivate and engage existing customers? – Get Twitter and Facebook. Unless you are Zappos, you are wasting your time trying to gain new customers from Twitter or Facebook. Selling to existing customers is many times easier than fighting to get new ones. In fact, it’s up to 60 times easier. The pot of gold is right under your nose! Hear two veteran internet marketing professionals present case studies from restaurateurs in Austin and Fort Worth who are using social media to shine a spotlight on their businesses - while giving the owners one big ass megaphone.
Branding / Marketing / Publicity Internet marketing, search engine optimization, social media marketing

Food Goes Social! Marketing Kogi, Fatburger & Calbi

Mike Prasad, {M} Consultancy
The last 2 years have seen an explosion of food brands attempting to market via social strategies and technology. Some get it, and some don't. Learn the strategies behind three of the most successful and groundbreaking campaigns, with behind-the-scenes insight on how social marketing and branding was seeded, leveraged and grown, presented by the one person involved in all of them. Starting with a look at how Kogi BBQ used Twitter and spurred a worldwide mobile food trend, continuing with Fatburger's first use of Foursquare's location-based offer feature, then finally with Calbi BBQ’s issues of turning a “me too” brand into a distinctive offering, this presentation will look at each brand's unique cases, issues, innovations and solutions that drove their engaging campaigns. Also included will be details on the Baja Fresh vs Kogi BBQ Twitter fight, and how Kogi BBQ won.
Branding / Marketing / Publicity food, Social Marketing & PR, twitter

The Anatomy of 140 Characters: Short Form Shakespeares

Victor Pineiro, Big Spaceship
Four years into Twitter and six into Facebook, short-form status updates are now many people's and brands' primary medium for communication with their audience. If this is our new language, surely we have a few Shakespeares by now. Is there a formula for a perfect tweet? And is it art or science? We'll explore how far we've come, and what the hindrances of short-form dispatch-style communications are.
Social Networking Facebook, tweets, twitter

The Tweets That Time (And You) Forgot

Mick Darling, Tomorrowish LLC
The conversations on Twitter and other social media add value to the common discourse, but even as the conversations happen we miss massive pieces, and afterward they become very difficult to find. The reliance on #hashtags and the lack of intelligent searching and filters on most Twitter clients complicates this conversation gap, resulting in a balkanization of the Twitter-sphere. At the last #140Conf in NY, during the course of one hour, over 75% of the tweets on topic about the conference would not have shown up in a search for "#140conf", which is the main way for outsiders to get in on the conversation. My company is collecting comprehensive conversations from events like the #140conf events in LA and Boston, major sporting events, television premieres and political events like the Presidential State of the Union Address. We will be collecting as much of the complete conversation from these events as we can using smart searching techniques, special filtering techniques, and good old fashioned human processing. Over the next year we will be harvesting the tweets from these events and will demonstrate how content producers and audiences can recapture lost conversations. At SXSWi 2011 we will compare what parts of these conversations are most talked about and demonstrate what the audience has been missing, providing insights and techniques to bring more people into the public conversations.
Community / Online Community analysis, conversion, twitter

Subscribers, Fans & Followers: The Audience Is Always Right

Jeffrey Rohrs, ExactTarget | CoTweet
Email. Facebook. Twitter. While they get lumped together as "social media," they are--for all intents and purposes--today's direct marketing trinity, a trio of online channels that connect billions of people around the world seamlessly. But how do consumers view and utilize each of these channels? Do their expectations differ when they're a brand subscriber versus a fan versus a follower? Should these consumer perceptions matter to marketers today? In 2010, we sought the answers to these questions and more by surveying over 1,500 consumers about their use of email, Facebook and Twitter. After publishing the results in a series of research papers (http://www.exacttarget.com/sff), we realized there was even more to the story and we set about literally writing the book on subscribers, fans and followers. What did we learn along the way? Join us for this interactive session and we'll share our favorite "Ah-ha moments" from the research and book. We'll also tell you what consumers love to love AND love to hate about your marketing via email, Facebook & Twitter. At the end of the session, you'll emerge with a wealth of actionable insights that can be put to immediate use with your own subscribers, fans and followers.
Social Networking email, Facebook, twitter

Caring For Your Online Introvert

Joanne McNeil, The Tomorrow Museum
If you are a geek, you are probably introverted. But you might not seem introverted online. In the comfort of your own home, you can have endless conversations on message boards and mailing lists, have several instant message chats simultaneously, and thrive on these controlled social interactions. This also works in reverse. An overwhelmingly extroverted person could be too busy going to parties or talking on the phone to keep up with Facebook, appearing introverted when it comes to social media. Some of us are offline and online introverts. We don't like tagging people on Facebook. We don't @ reply very much on Twitter. Communicating through social media feels like small talk to the average online introvert. Social media can feel as draining as a cocktail party. Introverts sometimes appear standoffish to extroverts. Online introverts are similarly misunderstood by online extroverts. This panel will discuss the conflicts that occur between online extroverts and online introverts. We will also discuss "netiquette," as it relates to different personality types.
Social Issues Facebook, introverts, twitter

Putting the "MED" in Social Media

Julian Bond, Detroit Medical Center
As the use of social media sites such as Facebook, YouTube, and Twitter has become increasingly "second-nature" to many users, the way that patients, doctors and even hospital/health systems communicate with each other has changed dramatically. Instead of just the usual face-to-face doctor's office chats and traditional hospital marketing methods, social media has opened the doors to new ways of communication in the health community. So with more and more people using this new technology, a good number of health/hospital systems have caught on to it and as a result have started to put the "MED" in the social media movement. At SXSWi during the SXSW Health track, come check out our panel to hear some of the people from a number of major health and hospital systems as they discuss how they have used the power of social media to reach their core audiences in various unique ways.
Health Health , social media, twitter

Tweets from September 11

Charles Mangin, Option8, LLC
2011 will mark the 10th anniversary of the events of September 11, 2001. Since that day, the world has changed in significant ways socially, politically and technologically. Consider recent natural and man-made disasters - earthquakes in Haiti and Chile, the BP oil spill in the Gulf of Mexico, Iceland's volcanic ash cloud - as well as politically divisive events - elections in Iran, US health care legislation. Facts, opinions and speculation for each new event spread faster than the last, through online social networks. More and more people are getting news of current events from sources like Twitter, and even network and cable news outlets are sourcing material from tweets and Facebook status updates. This panel will explore the emerging and historic role of social networks in disseminating news and information during disasters and other significant national and international events. It will also attempt to assess how differently the events of 9/11 would have been reported if Twitter and Facebook had been introduced to the world ten years earlier. With smartphones and handheld video cameras in the hands of thousands of people on the scene, would conspiracy theories and unanswered questions still swirl around Ground Zero? Would the events have changed at all, or their aftermath be different? In the context of these and other questions, we will speculate on how future disasters will be reported.
Social Issues 9/11, social web, twitter

 

You Should Be an Internet Personality.

Darrin Robertson, Orange Coast College
It has never been easier to be heard by so many people as it is in today's socially networked world. But what does that mean for us? What incentive is there for us to create anything online at all? Is our individual online presence and personality good for anything beyond simply getting attention? This panel will discuss the impact of tools such as Twitter and Tumblr on our ability to see and be seen, and why we should or should not be devoting time and energy to the pursuit of internet adoration.
Social Networking podcasting, social media, twitter

Social Media FAIL: Lessons From the Dark Side

Mark Williams, LiveWorld
Have you ever seen a social media FAIL in progress and wondered why the brand involved couldn't see the mistakes they were making that made the situation worse? Doing everything 'right' in social media planning does not necessarily guarantee success, but doing the wrong thing will definitely ensure failure. This workshop takes a look at recent social media FAIL case studies to help intermediate to advanced social media practitioners acquire skills and strategies to manage their communities and social media campaigns better. What do you DO when you realize that your ongoing efforts are not achieving the results you were looking for or worse... you find yourself in the middle of a crisis or a social media FAIL storm that is brewing? This workshop looks at a few notable social media FAIL examples and then adds a comparison/contrast of brands who have successfully navigated treacherous waters and concludes with best practices to help extract your company from a bad social media experience. Bring your real-world problems--we'll spend at least 1/3rd of the time workshopping attendee issues.
Branding / Marketing / Publicity community, Facebook, twitter

 

Corporate Quandary: Who Should Own Social Media?

Chris Harris, Canwest Broadcasting
Marketing department. Publicity. Consumer relations. Digital team. A senior executive. Social media guru-for-hire. The intern. All potential answers to a crucial business question - who should own a company’s social media strategy? The question has been with us for years but the answers remain diverse and debated. Through examination & discussion of successful case studies, this session explores how to maximize a brand’s social media efforts through organization, collaboration and authenticity.
Social Networking Corporate Social Media, social media management, twitter

Missing the Point: The Long v. Short-Form Debate

Thessaly La Force, The Paris Review
It's time to move on. Let's retire the debate about whether or not short-form writing on the web will replace long-form writing. In the same way that it's generally understood that bloggers will not replace journalists, Twitter will not take over The New Yorker. Tumblr will not eliminate the novel. But if we concede this point, then what are the ways in which these two forms will co-exist? Is it a peaceful but distant pairing? Is it symbiotic? Is it contentious? What does it mean to have community?
Journalism journalism, media, twitter

The Harassment Predicament: Minimizing Abuse, Maximizing Free Speech

Del Harvey, Twitter
Online services tread a narrow line between enabling free speech and preventing abuse of members. Offline, harassment is often determined contextually; unfortunately, website owners and operators often lack the time, insight, and ability to determine the context surrounding a given behavior. Additionally, the speech itself may not be directly abusive; thus, identifying other vectors for abuse is becoming increasingly important. As a result, Del Harvey, the Director of Twitter's Trust and Safety department, has spent a significant amount of the past two years working to develop objective litmus tests for evaluating potentially abusive behavior in the absence of context. This presentation will draw upon the work done at Twitter as well as Del's previous background working with online safety advocates to provide practical and doable policies and suggestions for sites to utilize with a minimum of engineering investment and personnel needs.
Social Issues free speech, internet safety, twitter

 

Tiny Strategies: Social Media in 60 Minutes or Less

Annie Lynsen, Small Act
In this quick-hit, practical-tips-focused panel, nonprofit social media experts will share their strategies for maximizing their social media impact with very little time to devote to it. We'll go round-robin style to find out how you should spend your time, whether you have 10 minutes, 30 minutes, or 60 minutes a day to devote to social media.
Social Networking Facebook, nonprofit, twitter

 

Tweet and Check-in Your Way to Love

Laurie Davis, eFlirt Expert
You notice her tweet, check-in or status update and feel a tingle through your iPad. Could it be instant virtual attraction? Though you’re in-touch with your digital aura, how can you translate that into finding love? We’ll teach you how to flirt in real-time and transition your avatars seamlessly offline. From the first tweet to the last check-in, this session will teach you how to best position your love life in a 2.0 world.
Online Relationships dating, Real-Time, twitter

Own Social Media; They Do

Jay Dolan, The Anti-Social Media
All of the major social networks are controlled by private companies. You and your business are putting your social strategy in the hands of people you've likely never met. You pray they don't change the terms of service yet again. You hope that another fail whale doesn't surface just as you start trying to launch your next campaign. All the while, they glean information you put out there for their own money-making purposes. The time is now to start telling our stories in a way that isn't controlled by a handful of corporations. The problem of the closed social network goes beyond just making one based on values of open-source and privacy such as Diaspora. It involves deeper questions of the value of our real identity online, and whether the convenience and entertainment of social networks is worth the sacrifice. It's time to start realizing how big the problem is and why the time for an open solution is now.
Social Networking Facebook, twitter

 

Automated Content in Social Media the Right Way

Tatyana Kanzaveli, Social CRM World
This will be a controversial topic, especially for self-proclaimed social media experts... We will overview ways to extract great content and share it with your friends and followers on social media channels in automated ways: auto-tweets, auto blog posts. I will give concrete examples of how this concept works, tested over a long period on a number of Twitter accounts, Facebook pages, etc. We will look at content aggregation tools and their use in the content management space.
Content Management content creation, content curation, twitter

 

Going Social? Start with Your Own Web Property

Jason Jaynes, Demand Media
Web sites are dead! Facebook and Twitter should be your focus! At least, that’s what some pundits would have you believe. But in reality there’s no place better than your own Web site for generating brand loyalty, giving your customers a place to energize and support each other, and gaining insights into your customers and your market. You ignore your own Web site at your peril, especially now that it’s so easy to connect your property – on your terms – to the Web’s giant social destinations. How do you properly integrate with social network sites such as Facebook, Twitter, LinkedIn? What types of social applications should you consider incorporating directly into your site? What types of traffic increases and time-on-site can be expected by incorporating these techniques? How do you accurately measure the business value success of your social investments? What types of marketing benefits can be leveraged by having a community on your site? All these questions and more will be discussed, with illustrations from real-world brand and publisher deployments.
Social Networking Facebook, owned media, twitter

 


          Distill: Is This What Journals Should Look Like?        
A month ago a post on the Y Combinator blog announced that they and Google have launched a new academic journal called Distill. Except that this is no ordinary journal of slightly enhanced PDFs; it is a big step towards the way academic communication should work in the Web era:
The web has been around for almost 30 years. But you wouldn’t know it if you looked at most academic journals. They’re stuck in the early 1900s. PDFs are not an exciting form.

Distill is taking the web seriously. A Distill article (at least in its ideal, aspirational form) isn’t just a paper. It’s an interactive medium that lets users – “readers” is no longer sufficient – work directly with machine learning models.
Below the fold, I take a close look at one of the early articles to assess how big a step this is.

How to Use t-SNE Effectively is one of Distill's launch articles. It has a DOI - doi:10.23915/distill.00002. It can be cited like any other paper:
For attribution in academic contexts, please cite this work as

Wattenberg, et al., "How to Use t-SNE Effectively", Distill, 2016. http://doi.org/10.23915/distill.00002

BibTeX citation

@article{wattenberg2016how,
author = {Wattenberg, Martin and Viégas, Fernanda and Johnson, Ian},
title = {How to Use t-SNE Effectively},
journal = {Distill},
year = {2016},
url = {http://distill.pub/2016/misread-tsne},
doi = {10.23915/distill.00002}
}
But this really isn't a conventional article:

Updates and Corrections

View all changes to this article since it was first published. If you see a mistake or want to suggest a change, please create an issue on GitHub.
The sub-head explains the article's goal:
Although extremely useful for visualizing high-dimensional data, t-SNE plots can sometimes be mysterious or misleading. By exploring how it behaves in simple cases, we can learn to use it more effectively.
Which is where it starts to look very different. It matches the goal set out in the blog post:
Ideally, such articles will integrate explanation, code, data, and interactive visualizations into a single environment. In such an environment, users can explore in ways impossible with traditional static media. They can change models, try out different hypotheses, and immediately see what happens. That will let them rapidly build their understanding in ways impossible in traditional static media.
And the article itself isn't static; it's more like a piece of open-source software:

Citations and Reuse

Diagrams and text are licensed under Creative Commons Attribution CC-BY 2.0, unless noted otherwise, with the source available on GitHub. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: “Figure from …”.
So far, so much better than a PDF, as you can see by visiting the article and playing with the examples, adjusting the sliders to see how the parameters affect the results.
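You can reproduce this kind of parameter exploration outside the browser too. As a rough sketch only (the article itself runs its own in-browser implementation, tsne.js, not scikit-learn), here is what varying the perplexity parameter looks like with scikit-learn's TSNE:

# A minimal sketch, assuming scikit-learn as a stand-in for the article's
# in-browser tsne.js implementation: watch how perplexity reshapes the embedding.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Two well-separated Gaussian clusters in 50 dimensions.
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 50)),
               rng.normal(5.0, 1.0, size=(100, 50))])

for perplexity in (2, 30, 100):
    # Too low a perplexity shatters clusters; too high smears the structure.
    emb = TSNE(n_components=2, perplexity=perplexity,
               init="pca", random_state=0).fit_transform(X)
    print(perplexity, emb[:2])

The point of the article, though, is that readers shouldn't need to set up a Python environment to build this intuition; the sliders do it in the page.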

But is this article as preservable as a PDF? You can try an interesting experiment. On your laptop, point your browser at the article, wait for it to load and show that the examples work. Now turn off WiFi, and the examples continue to work!

Using "View Source" you can see that the functionality of the article is implemented by a set of JavaScript files:
<script src="assets/d3.min.js"></script>
<script src="assets/tsne.js"></script>
<script src="assets/demo-configs.js"></script>
<script src="assets/figure-configs.js"></script>
<script src="assets/visualize.js"></script>
<script src="assets/figures.js"></script>

which are installed in your browser during page load and can be captured by a suitably configured crawler. So it is in principle as preservable as a PDF.
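To make "suitably configured" concrete, here is a minimal sketch of such a capture. It is an illustration only, not what the Internet Archive's crawler or Webrecorder actually run; it grabs just the page plus the script assets listed above:

# A minimal sketch: fetch the article and every script asset it references,
# which is enough to keep these particular figures working offline.
import os
from urllib.parse import urljoin
import requests
from bs4 import BeautifulSoup

base = "http://distill.pub/2016/misread-tsne/"
html = requests.get(base).text
os.makedirs("capture/assets", exist_ok=True)
with open("capture/index.html", "w") as f:
    f.write(html)

for tag in BeautifulSoup(html, "html.parser").find_all("script", src=True):
    path = os.path.join("capture", tag["src"].lstrip("/"))
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(requests.get(urljoin(base, tag["src"])).content)

A production crawler must also handle stylesheets, images, fonts and resources fetched dynamically at runtime, which is exactly where naive captures fall short.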

The Wayback Machine's first capture of the article is only partly functional, probably because the Internet Archive's crawler wasn't configured to capture everything. Webrecorder.io collects a fully functional version.

Here is a brief look at some of the other articles now up at Distill:
  • Attention and Augmented Recurrent Neural Networks (Olah & Carter) and Deconvolution and Checkerboard Artifacts (Odena et al.) both contain interactive diagrams illustrating the details of the algorithms they discuss. These again work when networking is disabled, so both seem to be preservable.
  • In Four Experiments in Handwriting with a Neural Network, Carter et al write:
    Neural networks are an extremely successful approach to machine learning, but it’s tricky to understand why they behave the way they do. This has sparked a lot of interest and effort around trying to understand and visualize them, which we think is so far just scratching the surface of what is possible.

    In this article we will try to push forward in this direction by taking a generative model of handwriting and visualizing it in a number of ways. The model is quite simple (so as to run well in the browser) so the generated output mostly produces gibberish letters and words (albeit, gibberish that look like real handwriting), but it is still useful for our purposes of exploring visualization techniques.
    Thus, like the Wattenberg et al article, this paper actually contains an implementation of the algorithm it discusses. In this case it is a model derived by machine learning, albeit one simple enough to run in the browser. Again, you can disable networking and show that the article's model and the animations remain fully functional.
  • Why Momentum Really Works by Gabriel Goh is similar, in that its interactive diagrams are powered by an implementation of the optimization technique it describes, which is again functional in the absence of network connectivity.
Clearly, Distill articles are a powerful way to communicate and explain the behavior of algorithms for machine learning. But there are still issues. Among the non-technical issues are:
  • Since Distill articles are selected via pre-publication peer review (they are also subject to post-publication review via GitHub), each needs a private GitHub repository during review, a cost presumably borne by the authors. But there don't appear to be any article processing charges (APCs).
  • There is a separate Distill Prize for "outstanding work communicating and refining ideas" with an endowment of $125K:
    The Distill Prize has a $125,000 USD initial endowment, funded by Chris Olah, Greg Brockman, Jeff Dean, DeepMind, and the Open Philanthropy Project. Logistics for the prize are handled by the Open Philanthropy Project.
  • The prize endowment does not explain how the journal itself is funded. Distill is open access, does not appear to levy APCs, and I can't find any information on the site about how its costs will be covered. The journal is presumably sponsored to some extent by the deep pockets of Google and Y Combinator, which could raise issues of editorial independence.
  • The costs of running the journal will be significant. There are the normal costs of the editorial and review processes, and the running costs of the Web site. But in addition, the interactive graphics are of extremely high quality, due presumably not to graphic design talent among the authors but to Distill's user interface design support:
    Distill provides expert editing to help authors improve their writing and diagrams.
    The editors' job is presumably made easier by the suite of tools provided to authors, but this expertise also costs money.
  • Distill does appear committed to open access to research. Attention and Augmented Recurrent Neural Networks has 21 references. An example is a paper published in the Journal of Machine Learning Research as Proceedings of the 33rd International Conference on Machine Learning. It appears as:
    Ask Me Anything: Dynamic Memory Networks for Natural Language Processing  [PDF]
    Kumar, A., Irsoy, O., Su, J., Bradbury, J., English, R., Pierce, B., Ondruska, P., Gulrajani, I. and Socher, R., 2015. CoRR, Vol abs/1506.07285.
    citing and linking to the Computing Research Repository at arxiv.org rather than the journal (JMLR doesn't appear to have DOIs via which to link). 14/21 of the article's references similarly obscure the actual publication (even if it is open access), linking instead to the open access version at arxiv.org, and another two are to preprints. Presumably this is an editorial policy.
Among the technical issues are:
  • Distill is a journal about machine learning, which is notoriously expensive in computation. There are limits as to how much computation can feasibly be extracted from your browser, so there will clearly be topics for which an appropriate presentation requires significant support from data-center resources. These will not be preservable.
  • Machine learning also notoriously requires vast amounts of data. This is another reason why some articles will require data-center support, and will not be preservable.
  • GitHub is not a preservation repository, so the conversation around the articles will not automatically be preserved.
  • If articles need data-center support, there will be unpredictable on-going costs that somehow need to be covered, a similar problem to the costs involved in cloud-based emulation services.
  • The title of an article's GitHub repository looks like this:
    How to Use t-SNE Effectively http://distill.pub/2016/misread-tsne/
    but the link should not go directly to the distill.pub website; it should go via the DOI https://doi.org/10.23915/distill.00002. This is yet another instance of the problem, discussed in Persistent URIs Must Be Used To Be Persistent by Herbert van de Sompel et al, of publishers preferring direct links over DOIs, thus partially defeating their purpose (a minimal illustration of what the DOI indirection buys follows below).
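Resolving the DOI follows a redirect to wherever the article currently lives, so the link keeps working even if distill.pub reorganizes its site. A minimal sketch (the DOI is the article's real one; the code is illustrative only):

# A minimal sketch: resolve the DOI rather than hard-coding the current URL.
import requests

resp = requests.get("https://doi.org/10.23915/distill.00002",
                    allow_redirects=True)
print(resp.url)                               # the article's current home
print([r.status_code for r in resp.history])  # the redirect chain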
The interactive diagrams and examples that provide the pedagogic power of Distill are constrained by the limits of what can be implemented in a browser. Communicating about machine learning at even a fairly small scale will be out of reach. For example, Distill obviously could not publish anything like End to End Learning for Self-Driving Cars by Bojarski et al, which describes how NVIDIA used one of their boxes to learn, and another to execute, a model capable of autonomously steering a car in traffic on the New Jersey Turnpike. The compute power of these boxes is far greater than is available from a browser. Even if data-center support were available, the browser still lacks a camera, steering, brakes, wheels, and an engine. Not to mention a professional self-driving car driver.

Note that Distill's use of GitHub is similar to the way the Journal of Open Source Software operates, but JOSS doesn't support execution of the software in the way Distill does, so does not require data-center support. Nor, since it covers only open source code, does it need private repositories.

In summary, given the limitations of the browser's execution environment, Distill does an excellent job of publishing articles that use interactivity to provide high-quality explanations and the ability to explore the parameter space. It does so without sacrificing preservability, which is important. It isn't clear how sustainable the journal will be, nor how much of the field of machine learning will be excluded by the medium's limitations.
          Google uses machine learning for new security features in Gmail        

Google has pushed four new security features to enterprise users on G Suite, the search giant's hosted business offering. The new protections come shortly after a Citizen Lab report exposed a Russia-linked phishing and disinformation campaign using Google services, including Gmail.

Google makes no mention of the Citizen Lab report in their posts on the new security features, but many of the protections take aim at common phishing techniques used to steal data and credentials.

The Citizen Lab report describes a phishing and disinformation campaign by Russian actors, which targeted more than 200 people across 39 countries.



          New app enables conservationists to quickly mine research for key insights        

Colandr Aids Practitioners and Policymakers in Making Faster, More Timely Science-Based Decisions

Determining the best course of action for protecting an ecosystem and the human livelihoods dependent on it is no quick and easy process, despite the urgency often felt around it. It can take months, even years of sifting through piles of studies to track down the evidence needed to make the right decision – until now.

Researchers from the Science for Nature and People Partnership (SNAPP), in partnership with Conservation International and DataKind, recently launched Colandr, an open-access machine learning application that allows for faster sifting and winnowing of scientific data to help conservation practitioners and policymakers find the evidence they need to make science-based decisions more quickly than ever before.



          Kuri robot will (hopefully) record your family's precious moments        

If you're a parent, you probably dread the thought of missing an important moment in your child's life. Do you really want to be in the other room when your little one takes those first steps? Mayfield Robotics thinks it can be there even when you can't. It's adding yet another feature to its upcoming Kuri home robot that will record moments independently. The tiny companion will use a mix of machine learning and image recognition to determine when it should start capturing video, using your preferences as a guide. Ideally, this will catch your kids' playtime or an impromptu dance party without asking you to lift a finger -- and the more it records, the more it should understand your tastes.

Source: Kuri Blog


          Google AI could keep baby food safe        

Google's artificial intelligence technology can help the food industry beyond picking better cucumbers. In one company's case, it could prevent your child from getting sick. Japanese food producer Kewpie Corporation has revealed that it's using Google's TensorFlow to quickly inspect ingredients, including the diced potatoes it uses in baby food. The firm and its partner BrainPad trained the machine learning system to recognize good ingredients by feeding it 18,000 photos, and set it to work looking for visual 'anomalies' that hint at sub-par potatoes. The result was an inspection system with "near-perfect" accuracy, culling more defective ingredients than humans alone -- even with a conveyor belt shuttling potatoes along at high speed.

Source: Google


          Apple launches a machine learning blog to placate its researchers        

Apple hasn't always been very open about its technology or its research, but the company surprised everyone last year when AI director Russ Salakhutdinov announced that Apple would begin publishing its machine learning research. Shortly thereafter, it published its first AI paper in an academic journal and today Apple takes its transparency another step with the debut of its Machine Learning Journal.

Via: 9to5Mac

Source: Apple Machine Learning Journal


          Tech News Today 1820: Not Your Mother's Tinder        

Tech News Today (MP3)

Happy Cheap Tesla Day, if $35,000 before customization is cheap to you. Tesla delivers its first Model 3 on Friday night, and Recode says this could be Elon Musk's iPhone moment. The car is sold out through most of 2018, with 350,000 preorders in April 2016 alone.

Anonymous social networking service Whisper is laying off 20 percent of its workforce this week. The editorial team bore the brunt of the cuts as the company relies further on machine learning to handle some of those media management tasks.

Who among us hasn't had our favorite hangouts shut down because they've hit on hard times? One would have hoped that this would be something we could avoid in the virtual world. Not so. Today, Comcast-backed Altspace VR announced that it is shutting down on August 3rd because it has run into unforeseen financial difficulties.

Plus: more details around the Broadcom WiFi worm that affected a billion smartphones, iRobot won't actually sell your Roomba's floor plan data, and Sam Machkovech from Ars Technica helps us compare OkCupid to Tinder, and has details on Matt Groening's new Netflix series.

Hosts: Megan Morrone and Jason Howell

Guest: Sam Machkovech

Download or subscribe to this show at https://twit.tv/shows/tech-news-today.

Thanks to CacheFly for the bandwidth for this show.


          ã‚なたの業務に機械学習を活用する5つのポイント        

2014/08/26 Machine Learning Casual Talks #2「あなたの業務に機械学習を活用する5つのポイント」の発表資料です。 http://mlct.connpass.com/event/8036/
          A VerySpatial Podcast – Episode 575        
A VerySpatial Podcast Shownotes – Episode 575 24 July 2017 The impacts of Artificial Intelligence and Machine Learning, and related technologies of Expert Systems and Neural Networks, on the geospatial industry and society (on our 12th Anniversary). Click to directly download MP3 Music I Feel Fantastic by Jonathan Coulton Topic This week we talk about […]
          IntMath Newsletter: Domain, range, Azure, Riemann        

In this Newsletter:

1. New applet: Domain and range exploration
2. Resource: Azure Machine Learning Studio
3. Math in the news: Riemann hypothesis
4. Math movie: Is math discovered or invented?
5. Math puzzles: Bees
6. Final thought: Bees


          Machine Learning Analytics (Part 3)        
This article is part of a series covering how changes still present a risk for today's IT operations, despite advances in technology and processes, and how a change-centric analytics approach...


          Customer Data Meets AI        
A new day is dawning for the customer experience, driven by the application of artificial intelligence, machine learning, and automated technologies to CRM data. The potential exists to transform the customer’s experience by providing service in a more predictive and intuitive way than ever before.
          The Future Of The Web Is Audible        
Like it or not the web has mostly been designed for those who can see it. The very nature of HTML and CSS is focused on how a web page looks, mostly disregarding our other senses. With the increasing popularity of wearable technology combined with advancements in machine learning, a [...]
          Google app is getting a smarter, more personalized feed starting today, thanks to machine learning        

Machine learning is all the rage these days, and the engineers at Google know that firsthand. The technology is being integrated into many Google products, one of them being the feed in the Google app. Today, our favorite tech company is improving on this with even more features that will help you get the news that you want out of the Google feed.

Content aggregation for your feed is getting a bit of an upgrade.


Google app is getting a smarter, more personalized feed starting today, thanks to machine learning was written by the awesome team at Android Police.


          Machine learning can predict rate of memory change        

Scientific Reports: Identification of clusters of rapid and slow decliners among subjects at risk for Alzheimer’s disease. In new research published today, researchers have created a machine learning algorithm that is able to form two distinct groups of people who have early memory problems known as mild cognitive impairment. The algorithm was able to predict […]

The post Machine learning can predict rate of memory change appeared first on Alzheimer's Research UK.


          Participation: ECMLPKDD2016        
I participated in the ECMLPKDD2016 conference held at Riva del Garda, Italy: the 27th European Conference on Machine Learning and the 20th Principles and Practice of Knowledge Discovery in Databases. A summary is on Twitter (in Japanese).
          AI: The promise and the peril        

Mommas, don’t let your babies grow up to be truck drivers. Or pretty much anything that a machine or a robot could do, if you want them to have a job. The list of those things will continue to get longer – in some cases rapidly – extending well beyond the assembly line on a factory floor.

The forecast is not all gloomy – artificial intelligence (AI), machine learning (ML) and automation are also expected to create jobs that will likely be much more interesting and creative than the repetitive tasks of the industrial age.

Indeed, AI has been a growing component of cybersecurity technology, and therefore cybersecurity jobs, for several years. Former Symantec CTO Amit Mital (now manager at KRNL Labs), at a panel discussion sponsored by Fortune magazine in 2015, called AI one of the “few beacons of hope in this mess” – the mess being cybersecurity, which he contended is “basically broken.”



          BioHiTech Global Launches BHTG Smart Mode Technology        
CHESTNUT RIDGE, N.Y.—BioHiTech Global, Inc., a green technology company that develops and deploys innovative and disruptive waste management technologies, announced the launch of BHTG Smart Mode, a new technology for its Eco-Safe Digester and Revolution Series Digester lines of on-site food waste disposal equipment. Smart Mode leverages cloud computing, machine learning, and the on-board internet-connected […]
          IBM speeds deep learning by using multiple servers        

For everyone frustrated by how long it takes to train deep learning models, IBM has some good news: It has unveiled a way to automatically split deep-learning training jobs across multiple physical servers -- not just individual GPUs, but whole systems with their own separate sets of GPUs.

Now the bad news: It's available only in IBM's PowerAI 4.0 software package, which runs exclusively on IBM's own OpenPower hardware systems.

Distributed Deep Learning (DDL) doesn't require developers to learn an entirely new deep learning framework. It repackages several common frameworks for machine learning: TensorFlow, Torch, Caffe, Chainer, and Theano. Deep learning projects that use those frameworks can then run in parallel across multiple hardware nodes.



          Machine Learning in Clojure - part 2        
I am trying to implement the material from the Machine Learning course on Coursera in Clojure.

My last post was about doing linear regression with 1 variable. This post will show that the same process works for multiple variables, and then explain why we represent the problem with matrices.

The only code in this post is calling the functions introduced in the last one. I also use the same examples, so this post will make a lot more sense if you read that one first.

For reference, here is the linear regression function:

(defn linear-regression [x Y a i]
  (let [m (first (cl/size Y))
        X (add-ones x)]
    (loop [Theta (cl/zeros 1 (second (cl/size X))) i i]
      (if (zero? i)
        Theta
        (let [ans (cl/* X (cl/t Theta))
              diffs (cl/- ans Y)
              dx (cl/* (cl/t diffs) X)
              adjust-x (cl/* dx (/ a m))]
          (recur (cl/- Theta adjust-x)
                 (dec i)))))))


Because the regression function works with matrices, it does not need any changes to run a regression over multiple variables.

Some Examples

In the English Premier League, a team gets 3 points for a win and 1 point for a draw, so a regression of points on wins should come close to recovering those numbers.

(->> (get-matrices [:win] :pts)
     reg-epl
     (print-results "wins->points"))

** wins->points **
A 1x2 matrix
-------------
1.24e+01 2.82e+00


When we add a second variable, the number of draws, we get close enough to ascribe the difference to rounding error.

(->> (get-matrices [:win :draw] :pts)
     reg-epl
     (print-results "wins+draws->points"))

** wins+draws->points **
A 1x3 matrix
-------------
-2.72e-01 3.01e+00 1.01e+00

In the last post, I asserted that scoring goals was the key to success in soccer.

(->> (get-matrices [:for] :pts)
     reg-epl
     (print-results "for->points"))


** for->points **
A 1x2 matrix
-------------
2.73e+00 9.81e-01

If you saw Costa Rica in the World Cup, you know that defense counts for a lot too. Looking at both goals for and against can give a broader picture.

(->> (get-matrices [:for :against] :pts)
     reg-epl
     (print-results "for-against->pts"))


** for-against->pts **
A 1x3 matrix
-------------
3.83e+01 7.66e-01 -4.97e-01


The league tables contain 20 fields of data, and the code works for any number of variables. Will adding more features (variables) make for a better model?

We can expand the model to include whether the goals were scored at home or away.

(->> (get-matrices [:for-h :for-a :against-h :against-a] :pts)
     reg-epl
     (print-results "forh-fora-againsth-againsta->pts"))


** forh-fora-againsth-againsta->pts **
A 1x5 matrix
-------------
3.81e+01 7.22e-01 8.26e-01 -5.99e-01 -4.17e-01

The statistical relationship we have found suggests that goals scored on the road are worth about .1 points more than those scored at home. The difference in goals allowed is even greater; they cost .6 points at home and only .4 on the road.

Wins and draws are worth the same number of points, no matter where the game takes place, so what is going on?

In many sports there is a “home field advantage”, and this is certainly true in soccer. A team that is strong on the road is probably a really strong team, so the relationship we have found may indeed be accurate.

Adding more features indiscriminately can lead to confusion.

(->> (get-matrices [:for :against :played :gd :for-h :for-a] :pts)
     reg-epl
     (map *)
     (print-results "kitchen sink"))

** kitchen sink **
(0.03515239958218979 0.17500425607459014 -0.22696465757628984 1.3357911841232217 0.4019689136508527 0.014497060396707949 0.1605071956778842)


When I printed out this result the first time, the parameter representing the number of games played displayed as a decimal point with no digit before or after. Multiplying each term by 1 got the numbers to appear. Weird.

The :gd stands for “goal difference”; it is the difference between the number of goals that a team scores and the number they give up. Because we are also pulling for and against, this is a redundant piece of information. Pulling home and away goals for makes the combined goals-for column redundant as well.

All of the teams in the sample played the same number of games, so that variable should not have influenced the model. Looking at the values, our model says that playing a game is worth 1.3 points, and this is more important than all of the other factors combined. Adding that piece of data removed information.

Let’s look at one more model with redundant data: goals for, goals against, and the goal difference, which is just the difference of the two.

(->> (get-matrices [:for :against :gd] :pts)
     reg-epl
     (print-results "for-against-gd->pts"))

** for-against-gd->pts **
A 1x4 matrix
-------------
3.83e+01 3.45e-01 -7.57e-02 4.21e-01


points = 38.3 + 0.345 * goals-for - 0.0757 * goals-against + 0.421 * goal-difference

The first term, Theta[0], is right around 38. If a team neither scores nor allows any goals during a season, they will draw all of their matches, earning 38 points. I didn’t notice that the leading term was 38 in all of the cases that included both goals for and against until I wrote this model without the exponents.

Is this model better or worse than the one that looks at goals for and goals against, without goal difference? I can’t decide.

Why Matrices?

Each of our training examples has a series of X values and one corresponding Y value. Our dataset contains 380 examples (20 teams * 19 seasons).
Our process is to make a guess as to the proper value for each parameter to multiply the X values by, and to compare the results in each case to the Y value. We use the differences between the products of our guesses and the real-life values to improve our guesses.

This could be done with a loop. With m examples and n features we could do something like

for i = 1 to m
  guess = 0
  for j = 1 to n
    guess = guess + X[i, j] * Theta[j]
  end for j
  difference[i] = guess - Y[i]
end for i

We would need another loop to calculate the new values for Theta.

Matrices have operations defined that replace the above loops. When we multiply the X matrix by the Theta vector, for each row of X we multiply each element by the corresponding element in Theta and add the products together to get the corresponding element of the result.

Matrix subtraction requires two matrices that are the same size. The result of subtraction is a new matrix that is the same size, where each element is the difference of the corresponding elements in the original matrices.

Using these two operations, we can replace the loops above with

Guess = X * Theta
Difference = Guess - Y

Clearly the notation is shorter. The other advantage is that there are matrix libraries that are able to do these operations much more efficiently than can be done with loops.

There are two more operations that are needed in the linear regression calculations. One is multiplying matrices by a single number, called a scalar. When multiplying a matrix by a number, multiply each element by that number. [1 2 3] * 3 = [3 6 9].

The other operation we perform is called a transpose. Transposing a matrix turns all of its rows into columns, and its columns into rows. In our examples, the size of X is m by n, and the size of Theta is 1 by n. We don’t have any way to multiply an m by n matrix and a 1 by n matrix, but we can multiply an m by n matrix and an n by 1 matrix. The product will be an m by 1 matrix.

In the regression function there are a couple of transposes to make the dimensions line up. That is the meaning of the cl/t expression. cl is an alias for the Clatrix matrix library.
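
To make these operations concrete, here is a small REPL-style sketch. The matrices are made up for illustration, and it assumes only the clatrix operations the regression function already uses (cl/matrix, cl/*, cl/-, and cl/t):

(def Theta (cl/matrix [[1 2 3]]))   ; a 1 by 3 matrix
(def X (cl/matrix [[1 4 7]
                   [1 5 8]]))       ; a 2 by 3 matrix with an X[0] column of 1's
(def Y (cl/matrix [[29]
                   [37]]))          ; a 2 by 1 matrix of known answers

(cl/* Theta 3)                      ; scalar multiplication: [3 6 9]
(cl/t Theta)                        ; transpose: a 3 by 1 matrix

;; a 2 by 3 matrix times a 3 by 1 matrix is a 2 by 1 matrix of predictions
(def Guess (cl/* X (cl/t Theta)))

;; subtracting two 2 by 1 matrices gives the 2 by 1 matrix of differences
(cl/- Guess Y)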

Even though we replaced a couple of calculations that could have been done in loops with matrix calculations, we are still performing these calculations in a series of iterations. There is a technique for calculating linear regression without the iterative process, called the Normal Equation.

I am not going to discuss the Normal Equation for two reasons. First, I don’t understand the mathematics. Second, the process we use, Gradient Descent, can be used with other types of machine learning techniques, and the Normal Equation cannot.

          Linear Regression in Clojure, Part I        
Several months ago I recommended the Machine Learning course from Coursera. At the time, I intended to retake the course and try to implement the solutions to the homework in Clojure. Unfortunately, I got involved in some other things, and wasn’t able to spend time on the class. 

Recently, a new book has come out, Clojure for Machine Learning. I am only a couple of chapters in, but it has already been a good help to me. I do agree with this review that the book is neither a good first Clojure book nor a good first machine learning resource, but it does join the two topics well.

Linear Regression
The place to start with machine learning is Linear Regression with one variable. The goal is to come up with an equation in the familiar form of y = mx + b, where x is the value you know and y is the value you are trying to predict. 

Linear regression is a supervised learning technique. This means that for each of the examples used to create the model the correct answer is known. 

We will use slightly different notation to represent the function we are trying to find. In place of b we will put Theta[0], and in place of m we will put Theta[1]. The reason for this is that we are going to be using a generalized technique that will work for any number of variables, and the result of our model will be a vector called Theta.

Even though our technique will work for multiple variables, we will focus on predicting based on a single variable. This is conceptually a little simpler, but more importantly it allows us to plot the input data and our results, so we can see what we are doing.

The Question
A number of years ago I read the book Moneyball, which is about the application of statistics to baseball. One of the claims in the book is that the best predictor for the number of games a baseball team wins in a season is the number of runs they score that season. To improve their results, teams should focus on strategies that maximize runs.

The question I want to answer is whether the same is true in soccer: is the number of points a team earns in a season correlated with the number of goals they score? For any who don’t know, a soccer team is awarded 3 points for a win and 1 point for a tie.

The importance of goals is a relevant question for a Manchester United fan. At the end of the 2012-13 season, head coach Sir Alex Ferguson retired after winning his 13th Premier League title. He was replaced by David Moyes. Under Moyes, the offense, which had been so potent the year before, looked clumsy. Also, the team seemed unlucky, giving up goals late in games, turning wins into draws and draws into defeats. The team that finished 1st the year before finished 7th in 2013-14. Was the problem a bad strategy, or bad luck?

The Data
I have downloaded the league tables for the last 19 years of the English Premier League from stato.com. There have actually been 22 seasons in the Premier League, but in the first 3 seasons each team played 42 games, vs 38 games for the last 19 seasons, and I opted for consistency over quantity.

I actually want to run 3 regressions: first on a case where I am sure there is a correlation, then on a case where I am sure there is not, and finally on the question of whether a correlation exists between goals and points.

There should be a high correlation between the number of wins a team has and their number of points. Since every team plays the same number of games, there should be no correlation between the number of games played and a team's position in the standings.

The Process
We will use a technique called gradient descent to find the equation we want to use for our predictions. We will start with arbitrary values for Theta[0] and Theta[1], setting both to 0. We will multiply each x value by Theta[1] and add Theta[0], and compare that result to the corresponding value of Y. We will use the differences between Y and the results of Theta * X to calculate new values for Theta, and repeat the process.

One way of measuring the quality of the prediction is with a cost function that measures the mean square error of the predictions. 

1/(2m) * sum((h(x[i]) - y[i])^2)

Where m is the number of test cases we are evaluating, and h(x[i]) is the predicted value for test case i. We will not use the cost function directly, but its derivative is used in improving our predictions of Theta as follows:

Theta[0] = Theta[0] - alpha * 1/m * sum(h(x[i]) - y[i])
Theta[1] = Theta[1] - alpha * 1/m * sum((h(x[i]) - y[i]) * x[i])

We have added one more symbol here. alpha is called the learning rate. The learning rate determines how much we modify Theta each iteration. If alpha is set too high, the process will oscillate between Thetas that are too low and too high, and it will never converge. When alpha is set lower than necessary, extra iterations are needed to converge.
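
Although we won't use the cost function in the regression loop, it makes a handy convergence check: if gradient descent is working, the cost should shrink with every iteration. Here is a minimal sketch built from the same clatrix operations as the code below (add-ones is defined in a moment; clatrix's get accessor for reading a single element is an assumption on my part):

(defn cost
  "Mean square error: 1/(2m) * sum((h(x[i]) - y[i])^2)"
  [x Y Theta]
  (let [m     (first (cl/size Y))
        X     (add-ones x)
        diffs (cl/- (cl/* X (cl/t Theta)) Y)]  ; h(x[i]) - y[i] for every example
    ;; diffs' * diffs is a 1x1 matrix holding the sum of the squared errors
    (/ (cl/get (cl/* (cl/t diffs) diffs) 0 0)
       (* 2 m))))

Printing the cost every few thousand iterations should show it steadily falling; if it grows or oscillates instead, alpha is set too high.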

I need to mention again that this methodology and these equations come directly from Professor Ng’s machine learning course on Coursera that I linked above. He spends over an hour on linear regression with one variable, and if you want more information that is the place to go.

The Code
The actual calculations we are going to do are operations on matrices. When we multiply the matrix X by the matrix Theta, we obtain a matrix of predictions that can be compared element by element with the matrix Y. The same results could be obtained by looping over each test case, but expressing the computations as matrix operations yields simpler equations, shorter code and better performance.

I used the clatrix matrix library for the calculations.

One other thing to note: in the equations above, Theta[0] is treated differently than Theta[1]; it is not multiplied by any x terms, either in the predictions or in the adjustments after the predictions. If we add an additional column to our X matrix, an X[0], and make all of the values in this column 1, we then no longer have to make a distinction between Theta[0] and Theta[1].

(defn add-ones "Add an X[0] column of all 1's to use with Theta[0]"
  [x]
  (let [m (first (cl/size x))            ; number of rows (training examples)
        ones-col (vec (repeat m 1))
        ones-mat (cl/matrix ones-col)]   ; an m x 1 column of 1's
    (cl/hstack ones-mat x)))
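
As a quick illustration with made-up numbers, add-ones turns a 3x1 x matrix into a 3x2 X matrix whose first column is all 1's:

(add-ones (cl/matrix [[5]
                      [7]
                      [9]]))
;; => a 3x2 matrix:
;;    1 5
;;    1 7
;;    1 9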

(defn linear-regression [x Y a i]
  (let [m (first (cl/size Y))          ; number of training examples
        X (add-ones x)]                ; prepend the X[0] column of 1's
    (loop [Theta (cl/zeros 1 (second (cl/size X))) i i]
      (if (zero? i)
        Theta
        (let [ans (cl/* X (cl/t Theta))     ; predictions, one per example
              diffs (cl/- ans Y)            ; h(x[i]) - y[i] for every example
              dx (cl/* (cl/t diffs) X)      ; sum of error * x for each parameter
              adjust-x (cl/* dx (/ a m))]   ; scale the adjustment by alpha / m
          (recur (cl/- Theta adjust-x)
                 (dec i)))))))

The linear-regression function takes as parameters the X and Y values that we use for training, the learning rate and the number of iterations to perform. We add a column of ones to the passed in X values. We initialize the Theta vector, setting all the values to 0. 

At this point X is a matrix of 380 rows and 2 columns. Theta is a matrix of 1 row and 2 columns. If we take the transpose of Theta (turn the rows into columns, and columns into rows) we get a new matrix, Theta’, which has 2 rows and 1 column. Multiplying the matrix X by Theta’ yields a 380x1 matrix containing all of the predictions, the same size as Y.

Taking the difference between the calculated answers and our known values yields a 380x1 matrix. We transpose this matrix, making it 1x380, and multiply it by our 380x2 X matrix, yielding a 1x2 matrix. We multiply each element in this matrix by a and divide by m, ending up with a 1x2 matrix which has the amounts we want to subtract from Theta, which is also a 1x2 matrix. All that is left to do is recur with the new values for Theta.
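
A quick way to keep these dimensions straight is to check sizes at the REPL. A hypothetical session (the exact output format may differ, but the row and column counts are the point):

(cl/size X)                      ; => (380 2)   one row per example, plus X[0]
(cl/size Theta)                  ; => (1 2)
(cl/size (cl/t Theta))           ; => (2 1)
(cl/size (cl/* X (cl/t Theta)))  ; => (380 1)   one prediction per example
(cl/size (cl/* (cl/t (cl/- (cl/* X (cl/t Theta)) Y)) X))
                                 ; => (1 2)     the adjustments to Theta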

The Results
Since I am going to be performing the same operations on three different data sets, I wrote a couple of helper functions. plot-it uses Incanter to display a scatter plot of the data. reg-epl calls the linear-regression function specifying a learning rate of .0001 and 1000000 iterations. I also have a get-matrices function, which downloads the data and creates the X and Y matrices for the specified fields.

(def wins (get-matrices [:win] :pts))
(plot-it wins)
(def win-theta (reg-epl wins))
(println "Wins-points: " win-theta)

Yields this graph



and these results

Wins-points:   A 1x2 matrix
 -------------
 1.24e+01  2.82e+00

The relationship between wins and points is obvious in the graph. The equation we developed estimates wins as being worth 2.82 points, instead of the correct 3. This is because it had no way to account for draws, and used a high intercept to get those extra points in there.

A team with 0 wins would be expected to have 12.4 points. A team with 10 wins would have 12.4 + 2.82 * 10 = 40.6 points. A team with 25 wins would have 12.4 + 2.82 * 25 = 82.9 points.

(def played (get-matrices [:played] :rank))
(plot-it played)
(def played-theta (reg-epl played))
(println "played-rank: " played-theta)
(println "expected finish:" (+ (first played-theta)
                               (* 38 (second played-theta))))

Playing 38 games gives you an equal chance of having a finishing position anywhere between 1 and 20. The graph gives a good illustration of what no-correlation looks like.



If we use the terms in Theta to find the expected finishing position for a team playing 38 games, we find exactly what we expect, 10.5.

played-rank:   A 1x2 matrix
 -------------
 7.27e-03  2.76e-01

expected finish: 10.499999999999996

Ok, now that we have seen what it looks like when we have a strong correlation, and no correlation, is there a correlation between goals and points?

(def goals (get-matrices [:for] :pts))
(plot-it goals)
(def goal-theta (reg-epl goals))
(def goal-lm (find-lm goals))
(println "goals-points: " goal-theta)
(println "goals-points (incanter): " goal-lm)

Looking at the graph, while it is not quite as sharp as the wins-points graph, it definitely looks like scoring more goals earns you more points.



To double check my function, I also used Incanter’s linear-model function to generate an intercept and slope. (And yes, I am relieved that they match).

goals-points:   A 1x2 matrix
 -------------
 2.73e+00  9.81e-01

goals-points (incanter):  [2.7320304686089685 0.9806635460888629]

We can superimpose the line from our regression formula on the graph, to see how they fit together.

(def goal-plot (scatter-plot (first goals) (second goals)))
(defn plot-fn [x]
  (+ (* (second goal-theta) x) (first goal-theta)))
(def plot-with-regression (add-function goal-plot plot-fn 0 100))

(view plot-with-regression)



The Answer
We can calculate how many points we would expect the team to earn based on their 86 goals in 2012-13 and 64 goals in 2013-14.

(println "86 goals = " (+ (first goal-theta)
                          (* (second goal-theta) 86)))

(println "64 goals = " (+ (first goal-theta)
                          (* (second goal-theta) 64)))

86 goals =  87.07011197597255
64 goals =  65.49481001604704

In the last year under Sir Alex, Manchester United earned 89 points, 2 more than the formula predicts. In their year under David Moyes, they earned 64 points, 1.5 less than the formula predicts. 

Of the 25 point decline in Manchester United’s results, 21.5 points can be attributed to the failure of the offense under Moyes, and 3.5 points can be attributed to bad luck or other factors. 

Manchester United’s attacking style isn’t just fun to watch; it is also the reason they win so much. Hopefully the team’s owners have learned that lesson, and will stick to attack-minded managers in the future.

You can find all of the code for the project on github.
          ML Class Notes: Lesson 1 - Introduction        

I am taking the Machine Learning class at Coursera. These are my notes on the material presented by Professor Ng.

The first lesson introduces a number of concepts in machine learning. There is no code to show until the first algorithm is introduced in the next lesson.

Machine learning grew out of AI research. It is a field of study that gives computers the ability to learn algorithms and processes that cannot be explicitly programmed. Computers could be programmed to do simple things, but doing more complicated things required that the computer learn for itself. A well-posed learning program is said to learn some task if its performance improves with experience.

Machine Learning is used for a lot of things including data mining in business, biology and engineering; performing tasks that can't be programmed by hand like piloting helicopters or computer vision; self-customizing programs like product recommendations; and as a model to try to understand human learning.

Two of the more common categories of machine learning algorithms are supervised and unsupervised learning. Other categories include reinforcement learning and recommender systems, but they were not described in this lesson.

Supervised Learning

In supervised learning the computer is taught to make predictions using a set of examples where the historical result is already known. One type of supervised learning task is regression, where the predicted value is in a continuous range (the example given was predicting home prices). Other supervised learning algorithms perform classification, where examples are sorted into two or more buckets (the examples given were of email, which can be spam or not spam; and tumor diagnosis, which could be malignant or benign).

Unsupervised Learning

In unsupervised learning, the computer must teach itself to perform a task because the "correct" answer is not known. A common unsupervised learning task is clustering. Clustering is used to group data points into different categories based on their similarity to each other. Professor Ng gave the example of Google News, which groups related news articles, allowing you to select accounts of the same event from different news sources.

The unsupervised learning discussion ended with a demonstration of an algorithm that had been used to solve the "cocktail party problem", where two people were speaking at the same time in the same room, and were recorded by two microphones in different parts of the room. The clustering algorithm was used to determine which sound signals were from each speaker. In the initial recordings, both speakers could be heard on both microphones. In the sound files produced by the learning algorithm, each output has the sound from one speaker, with the other speaker almost entirely absent.


          Take the Machine Learning Class at Coursera        

Coursera is offering its Machine Learning course again, beginning March 8, and I highly recommend it. You already know the obvious, that it is a course on an incredibly timely career skill and it is free, but until you take the course you can't know just how good the course really is.

You will learn how to write algorithms to perform linear regression, logistic regression, neural networks, clustering and dimensionality reduction. Throughout the course Professor Ng explains the techniques that are used to prepare data for analysis, why particular techniques are used, and how to determine which techniques are most useful for a particular problem.

In addition to the explanation of what and why, there is an equal amount of explaining how. The 'how' is math, specifically linear algebra. From the first week to the last, Ng clearly explains the mathematical techniques and equations that apply to each problem, how the equations are represented with linear algebra, and how to implement each calculation in Octave or Matlab.

The course has homework. Each week, there is a zip file that contains a number of incomplete matlab files that provide the structure for the problem to be solved, and you need to implement the techniques from the week's lessons. Each assignment includes a submission script that is run from the command line. You submit your solution, and it either congratulates you for getting the right answer, or informs you if your solution was incorrect.

It is possible to view all of the lectures without signing up for the class. Don't do that. Sign up for the class. Actually signing up for the class gives you a schedule to keep to. It also allows you to get your homework checked. When you watch the lectures, you will think you understand the material; until you have done the homework you really don't. As good as the teaching is, the material is still rigorous enough that it will be hard to complete if you are not trying to keep to a schedule. Also, if you complete the course successfully, you will be able to put it on your resume and LinkedIn profile.

You have the time. When I took the class, there was extra time built in to the schedule to allow people who started the course late to stay on pace. Even if you fall behind, the penalty for late submission is low enough that it is possible to complete every assignment late and still get a passing grade in the course.

I am going to take the course again. I want to review the material. I also want to try to implement the homework solutions in Clojure, in addition to Octave. I will be posting regularly about my progress.

You may also be able to find a study group in your area. I decided to retake the course when I found out that there was going to be a meetup group in my area. Even without a local group, the discussion forums are a great source of help throughout the class. The teaching assistants and your classmates provide a lot of guidance when you need it.


          Learning functional programming at Coursera        

I am currently taking Martin Odersky's course Functional Programming Principles in Scala on Coursera. This is my first time taking a course from Coursera. At the same time I signed up for this course, I also signed up for a course on Reactive Programming that Odersky will be teaching with Erik Meijer and Roland Kuhn beginning November 4.

There are hundreds of courses available on all sorts of subjects like humanities, science, engineering, and of course computer science, and all are free. In addition to the Scala course, I have started taking a machine learning course. Its format is the same as the Scala course, so I am going to assume the format is standard. (The machine learning course was the class that launched Coursera, which is another reason to think it is the standard.)

Each week new video lectures are posted. Lectures are typically 10 to 15 minutes long, and the total amount of material each week is 1.5 to 2 hours. There has been a programming assignment each of the first 4 weeks. An extra week was provided for the 4th assignment, and after watching the week 5 lectures, it was clear that the assignment covered material from both weeks.

After completing each assignment, it is submitted by using a 'submit' command in Scala's Simple Build Tool. After a couple of minutes, you can go to the assignment page on the course website and see your grade. 80% of each grade comes from passing automated tests, and 20% comes from a style checker, which will take off points for using mutable state or nulls. You can submit each assignment up to 5 times with only the highest score being counted. (After that you can continue to submit the assignment, and you will receive the same feedback on your work, but you will not get credit for it.) You need to achieve an average of 70% on the homework to receive a certificate of completion for the course.

I really enjoy the format of the lectures. Some of the time Odersky is talking in front of the camera, but most of the time there are slides up on the screen. He is able to write on the slide. The translucent image of his head as he leans over a slide, or his hand and pen as he writes, is a really minor feature that somehow makes the video more interesting to watch. From time to time, the video is paused while a question appears on the screen. Some questions are multiple choice and you submit an answer before moving on. Others are open ended (how would you write a function that…) and you are left to try it on your own, but there is nothing to submit before you hit continue. Odersky then proceeds to provide a complete explanation of the solution.

The quality of the teaching is excellent. The course builds a foundation by teaching the substitution method of function evaluation (which, if I ever learned it before, I have forgotten), then moves on to recursion, higher order functions and currying. Because Scala is a hybrid functional/object oriented language, there has also been a lot of discussion of object hierarchies and Scala's type system. Pattern matching, tuples and lists have also been covered.

I have found all of the assignments to be challenging. The format is great. You download a zip file that contains a skeleton for the solution and a series of test cases. The tests don't cover the whole assignment but they provide a good start, and give guidance on how to write additional tests. The first week I spent a lot of time, because I decided to read Scala for the Impatient until I knew enough syntax to solve the problem. (It would have been faster if lists had been covered before chapter 13). After that, I would estimate that I have spent 6 or 7 hours per week on the assignments.

I believe that I am learning the material better through the course than I would reading a book. I have a tendency when reading a book to skim parts that don't interest me as much, or somehow I think aren't relevant to things I am likely to do. Also, the graded homework means that I have to stick to a problem until I get it right, rather than until I think I know what I am doing.

I did have a little apprehension at first because the course assumes that you are going to be working with Eclipse, which I have just never really gotten the feel for. I remembered setting up Scala, SBT and Eclipse as being challenging. The course provided clear written instructions and video instructions for installing all of the necessary tools, with all of the appropriate download links.

The workload is not trivial, but I highly recommend taking classes at Coursera. The teaching is excellent. The variety of courses is amazing. I am very grateful to them for making such wonderful resources available for free.


          437 Digital Marketers Went Head-to-Head with a Conversion-Predicting Machine — Who Reigns Supreme?        
At the fourth annual Call to Action Conference, 437 savvy digital marketers went head-to-head with a machine learning algorithm to predict which landing pages converted above and which converted below industry averages. Who reigns supreme? The answer provides us with a glimpse into the future of digital marketing and how artificial intelligence will help us become smarter, more effective marketers.
          Datanauts 059: The Machine Learning Hype Cycle        
The Datanauts dive into machine learning--how does it work, what will it do for us, and what are common use cases--with guest Ed Henry, Senior Machine Learning Engineer at Brocade. The post Datanauts 059: The Machine Learning Hype Cycle appeared first on Packet Pushers.
          Amazon Research Awards Call For Proposals        
The Amazon Research Awards (ARA) program has opened a call for proposals for the 2017 round of awards in a number of areas including computer vision, general AI, knowledge management and data quality, machine learning, machine translation, and natural language understanding. For the full list of topics, please see this website. The program is open to faculty members in North America and Europe and awards up to $80,000 in cash and $20,000 AWS promotional credits. Submission Requirements Project proposals should be a maximum of 4 pages (single column, minimum 10 pt font), plus 1 page for references, plus CV. All content components (proposal, references, CV) should be composed into a single PDF file. […]
          Welltory packs a lot of science into its app to measure your stress levels        
There’s a lot of talk about the quantified self, but one of the grey areas remains working out your levels of stress. Usually this requires hardware devices. Now a New York based startup thinks it’s come up with an approach based on specially developed algorithms and machine learning, using simple heartbeat readings taken with a smartphone app. Welltory (iOS, Android) has also…
          Part 5 – The right to be forgotten (EU GDPR)
This is the fifth part of a series of blog posts on 'How the EU GDPR will affect the use of Machine Learning'. Article 17, titled Right of Erasure (right to be forgotten), allows a person to obtain the erasure of their data, with the data controller ensuring that the personal data is erased without any […]
           A machine learning investigation of a beta-carotenoid dataset         
Revett, Kenneth (2008) A machine learning investigation of a beta-carotenoid dataset. In: Granular computing: at the junction of rough sets and fuzzy sets. Studies in fuzziness and soft computing (224). Springer, Berlin / Heidelberg, pp. 211-227. ISBN 9783540769729
          How I'm fighting bias in algorithms | Joy Buolamwini        
MIT grad student Joy Buolamwini was working with facial analysis software when she noticed a problem: the software didn't detect her face -- because the people who coded the algorithm hadn't taught it to identify a broad range of skin tones and facial structures. Now she's on a mission to fight bias in machine learning, a phenomenon she calls the "coded gaze." It's an eye-opening talk about the need for accountability in coding ... as algorithms take over more and more aspects of our lives.
          Chris Crawford’s Famous Dragon Speech and Interview From 2008        
Chris Crawford recently highlighted a “cleaned-up” version of his famous, game industry-defining “Dragon Speech” from GDC 1992. In this speech, Chris Crawford explained his dream of true “interactivity” and how the game industry and he had parted ways. While he was almost universally rejected at the time, in an era of machine learning, A.I. […]
          ARM's latest CPUs are ready for an AI-powered future        

ARM processor technology already powers many of the devices you use every day, and now the company is showing off its plans for the future with DynamIQ. Aimed squarely at pushing the artificial intelligence and machine learning systems we're expecting to see in cars, phones, gaming consoles and everything else, it's what the company claims is an evolution of the existing "big.LITTLE" technology.

Originally unveiled in 2011, that design allowed for multicore CPU designs with powerful, power-hungry chips to do the heavy lifting tethered to smaller, low-power chips that could handle background processing when a device is idle. It's why your phone can edit HD or even 4K video at one moment before sleeping throughout the night without losing all of the battery's charge. DynamIQ lays out a strategy for processors that combine cores specifically designed for whatever task is needed.

Source: ARM Blog


          Industry Efforts to Censor Pro-Terrorism Online Content Pose Risks to Free Speech        

In recent months, social media platforms—under pressure from a number of governments—have adopted new policies and practices to remove content that promotes terrorism. As the Guardian reported, these policies are typically carried out by low-paid contractors (or, in the case of YouTube, volunteers) and with little to no transparency and accountability. While the motivations of these companies might be sincere, such private censorship poses a risk to the free expression of Internet users.

As groups like the Islamic State have gained traction online, Internet intermediaries have come under pressure from governments and other actors, including the following:

  • the Obama Administration;
  • the U.S. Congress in the form of legislative proposals that would require Internet companies to report “terrorist activity” to the U.S. government;
  • the European Union in the form of a “code of conduct” requiring Internet companies to take down terrorist propaganda within 24 hours of being notified, and via the EU Internet Forum;
  • individual European countries such as the U.K., France and Germany that have proposed exorbitant fines for Internet companies that fail to take down pro-terrorism content; and,
  • victims of terrorism who seek to hold social media companies civilly liable in U.S. courts for providing “material support” to terrorists by simply providing online platforms for global communication.

One of the coordinated industry efforts against pro-terrorism online content is the development of a shared database of “hashes of the most extreme and egregious terrorist images and videos” that the companies have removed from their services. The companies that started this effort—Facebook, Microsoft, Twitter, and Google/YouTube—explained that the idea is that by sharing “digital fingerprints” of terrorist images and videos, other companies can quickly “use those hashes to identify such content on their services, review against their respective policies and definitions, and remove matching content as appropriate.”

As a second effort, the same companies created the Global Internet Forum to Counter Terrorism, which will help the companies “continue to make our hosted consumer services hostile to terrorists and violent extremists.” Specifically, the Forum “will formalize and structure existing and future areas of collaboration between our companies and foster cooperation with smaller tech companies, civil society groups and academics, governments and supra-national bodies such as the EU and the UN.” The Forum will focus on technological solutions; research; and knowledge-sharing, which will include engaging with smaller technology companies, developing best practices to deal with pro-terrorism content, and promoting counter-speech against terrorism.

Internet companies are also taking individual measures to combat pro-terrorism content. Google announced several new efforts, while both Google and Facebook have committed to using artificial intelligence technology to find pro-terrorism content for removal.

Private censorship must be cautiously deployed

While Internet companies have a First Amendment right to moderate their platforms as they see fit, private censorship—or what we sometimes call shadow regulation—can be just as detrimental to users’ freedom of expression as governmental regulation of speech. As social media companies increase their moderation of online content, they must do so as cautiously as possible.

Through our project Onlinecensorship.org, we monitor private censorship and advocate for companies to be more transparent and accountable to their users. We solicit reports from users of when Internet companies have removed specific posts or other content, or whole accounts.

We consistently urge companies to follow basic guidelines to mitigate the impact on users’ free speech. Specifically, companies should have narrowly tailored, clear, fair, and transparent content policies (i.e., terms of service or “community guidelines”); they should engage in consistent and fair enforcement of those policies; and they should have robust appeals processes to minimize the impact on users’ freedom of expression.

Over the years, we’ve found that companies’ efforts to moderate online content almost always result in overbroad content takedowns or account deactivations. We, therefore, are justifiably skeptical that the latest efforts by Internet companies to combat pro-terrorism content will meet our basic guidelines.

A central problem for these global platforms is that such private censorship can be counterproductive. Users who engage in counter-speech against terrorism often find themselves on the wrong side of the rules if, for example, their post includes an image of one of more than 600 “terrorist leaders” designated by Facebook. In one instance, a journalist from the United Arab Emirates was temporarily banned from the platform for posting a photograph of Hezbollah leader Hassan Nasrallah with a LGBTQ pride flag overlaid on it—a clear case of parody counter-speech that Facebook’s content moderators failed to grasp.

A more fundamental problem is that having narrow definitions is difficult. What counts as speech that “promotes” terrorism? What even counts as “terrorism”? These U.S.-based companies may look to the State Department’s list of designated terrorist organizations as a starting point. But Internet companies will sometimes go further. Facebook, for example, deactivated the personal accounts of Palestinian journalists; it did the same thing for Chechen independence activists under the guise that they were involved in “terrorist activity.” These examples demonstrate the challenges social media companies face in fairly applying their own policies.

A recent investigative report by ProPublica revealed how Facebook’s content rules can lead to seemingly inconsistent takedowns. The authors wrote: “[T]he documents suggest that, at least in some instances, the company’s hate-speech rules tend to favor elites and governments over grassroots activists and racial minorities. In so doing, they serve the business interests of the global company, which relies on national governments not to block its service to their citizens.” The report emphasized the need for companies to be more transparent about their content rules, and to have rules that are fair for all users around the world.

Artificial intelligence poses special concerns

We are concerned about the use of artificial intelligence automation to combat pro-terrorism content because of the imprecision inherent in systems that automatically block or remove content based on an algorithm. Facebook has perhaps been the most aggressive in deploying AI in the form of machine learning technology in this context. The company’s latest AI efforts include using image matching to detect previously tagged content, using natural language processing techniques to detect posts advocating for terrorism, removing terrorist clusters, removing new fake accounts created by repeat offenders, and enforcing its rules across other Facebook properties such as WhatsApp and Instagram.

This imprecision exists because it is difficult for humans and machines alike to understand the context of a post. While it’s true that computers are better at some tasks than people, understanding context in written and image-based communication is not one of those tasks. While AI algorithms can understand very simple reading comprehension problems, they still struggle with even basic tasks such as capturing meaning in children’s books. And while it’s possible that future improvements to machine learning algorithms will give AI these capabilities, we’re not there yet.

Google’s Content ID, for example, which was designed to address copyright infringement, has also blocked fair uses, news reporting, and even posts by copyright owners themselves. If automatic takedowns based on copyright are difficult to get right, how can we expect new algorithms to know the difference between a terrorist video clip that’s part of a satire and one that’s genuinely advocating violence?

Until companies can publicly demonstrate that their machine learning algorithms can accurately and reliably determine whether a post is satire, commentary, news reporting, or counter-speech, they should refrain from censoring their users by way of this AI technology.

Even if a company were to have an algorithm for detecting pro-terrorism content that was accurate, reliable, and had a minimal percentage of false positives, AI automation would still be problematic because machine learning systems are not robust to distributional change. Once machine learning algorithms are trained, they are as brittle as any other algorithm, and building and training machine learning algorithms for a complex task is an expensive, time-intensive process. Yet the world that algorithms are working in is constantly evolving and soon won’t look like the world in which the algorithms were trained.

This might happen in the context of pro-terrorism content on social media: once terrorists realize that algorithms are identifying their content, they will start to game the system by hiding their content or altering it so that the AI no longer recognizes it (by leaving out key words, say, or changing their sentence structure, or a myriad of other ways—it depends on the specific algorithm). This problem could also go the other way: a change in culture or how some group of people express themselves could cause an algorithm to start tagging their posts as pro-terrorism content, even though they’re not (for example, if people co-opted a slogan previously used by terrorists in order to de-legitimize the terrorist group).
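
To see how easily a fixed classifier can be gamed, consider this deliberately toy sketch (our illustration, not any platform’s actual system) of a classifier that flags posts containing known signal phrases; the phrase list, threshold, and spam-style examples are all invented, but the dynamic is the one described above: once the signals are known, a trivial rewrite evades them.

```python
# A toy, keyword-based "classifier" invented for illustration. Real systems
# are far more sophisticated, but they face the same adversarial dynamic.
SIGNAL_TERMS = {"free money", "act now", "winner"}  # hypothetical training-era signals

def is_flagged(post: str, threshold: float = 0.5) -> bool:
    """Flag a post if it contains at least half of the known signal terms."""
    text = post.lower()
    hits = sum(term in text for term in SIGNAL_TERMS)
    return hits / len(SIGNAL_TERMS) >= threshold

before = "You are a winner! Act now for free money."
after = "You are a w1nner! Act n0w for fr3e m0ney."  # same message, altered spelling

print(is_flagged(before))  # True: matches the signals the system was built on
print(is_flagged(after))   # False: the reworded version slips through
```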

We strongly caution companies (and governments) against assuming that technology will be the panacea in identifying pro-terrorism content, because this technology simply doesn’t yet exist.

Is taking down pro-terrorism content actually a good idea?

Apart from the free speech and artificial intelligence concerns, there is an open question of efficacy. The sociological assumption is that removing pro-terrorism content will reduce terrorist recruitment and community sympathy for those who engage in terrorism. In other words, the question is not whether terrorists are using the Internet to recruit new operatives—the question is whether taking down pro-terrorism content and accounts will meaningfully contribute to the fight against global terrorism.

Governments have not sufficiently demonstrated this to be the case. And some experts believe this absolutely not to be the case. For example, Michael German, a former FBI agent with counter-terrorism experience and current fellow at the Brennan Center for Justice, said, “Censorship has never been an effective method of achieving security, and shuttering websites and suppressing online content will be as unhelpful as smashing printing presses.” In fact, as we’ve argued before, censoring the content and accounts of determined groups could be counterproductive and actually result in pro-terrorism content being publicized more widely (a phenomenon known as the Streisand Effect).

Additionally, permitting terrorist accounts to exist and allowing pro-terrorism content to remain online, including that which is publicly available, may actually be beneficial by providing opportunities for ongoing engagement with these groups. For example, a Kenyan government official stated that shutting down an Al Shabaab Twitter account would be a bad idea: “Al Shabaab needs to be engaged positively and [T]witter is the only avenue.”

Keeping pro-terrorism content online also contributes to journalism, open source intelligence gathering, academic research, and generally the global community’s understanding of this tragic and complex social phenomenon. On intelligence gathering, the United Nations has said that “increased Internet use for terrorist purposes provides a corresponding increase in the availability of electronic data which may be compiled and analysed for counter-terrorism purposes.”

In conclusion

While we recognize that Internet companies have a right to police their own platforms, we also recognize that such private censorship is often in response to government pressure, which is often not legitimately wielded.

Governments often get private companies to do what they can’t do themselves. In the U.S., for example, pro-terrorism content falls within the protection of the First Amendment. Other countries, many of which do not have similarly robust constitutional protections, might nevertheless find it politically difficult to pass speech-restricting laws.

Ultimately, we are concerned about the serious harm that sweeping censorship regimes—even by private actors—can have on users, and society at large. Internet companies must be accountable to their users as they deploy policies that restrict content.

First, they should make their content policies narrowly tailored, clear, fair, and transparent to all—as the Guardian’s Facebook Files demonstrate, some companies have a long way to go.

Second, companies should engage in consistent and fair enforcement of those policies.

Third, companies should ensure that all users have access to a robust appeals process—content moderators are bound to make mistakes, and users must be able to seek justice when that happens.

Fourth, until artificial intelligence systems can be proven accurate, reliable and adaptable, companies should not deploy this technology to censor their users’ content.

Finally, we urge those companies that are subject to increasing governmental demands for backdoor censorship regimes to improve their annual transparency reporting to include statistics on takedown requests related to the enforcement of their content policies.


          SQL Server 2017 Integration Services Cookbook        

I coauthored my 15th book! Together with Christian Cote (lead author) and Matija Lah (coauthor), we published the SQL Server 2017 Integration Services Cookbook. Of course, it is a bit early to call this a definitive guide to SSIS 2017; a more accurate name would be the SSIS 2016 / 2017 Cookbook. Besides detailed guidelines on how to use the 2016 version, you will also find a chapter with new information on scaling out SSIS 2017. In the future, we will add an online chapter, if needed, about additional new SSIS 2017 functionality. Anyway, here is a brief description of the chapters.

Chapter 1: SSIS Setup

This chapter describes, step by step, how to set up SQL Server 2016 to get the features that are used in the book.

Chapter 2: What is New in SSIS 2016

This chapter is an overview of the new features in Integration Services 2016. Some of the topics introduced here are covered extensively later in the book.

Chapter 3: Key Components of a Modern ETL Solution

This chapter explains how ETL has evolved over the past few years and what components are necessary for a modern, scalable ETL solution that fits the modern data warehouse.

Chapter 4: Data Warehouse Loading Techniques

This chapter describes many patterns used for data warehouse (DW) and operational data store (ODS) loads.

Chapter 5: Dealing with Data Quality

This chapter describes how SSIS, DQS, and MDS can be leveraged to validate, cleanse, maintain, and load data.

Chapter 6: SSIS Performance and Scalability

This chapter talks about how to monitor SSIS package execution. It also provides solutions to scale out processes by using parallelism. Readers learn how to identify bottlenecks and how to resolve them using various techniques.

Chapter 7: Unleash the Power of SSIS Script Task and Component

Readers learn how script tasks and script components are valuable in many situations for overcoming the limitations of the stock toolbox tasks and transforms.

Chapter 8: SSIS and Advanced Analytics

This chapter covers using SSIS to prepare data for, and to perform, advanced analyses such as data mining, machine learning, and text mining. Readers learn how the sampling components can be used to prepare the training and test sets, how to use SQL Server Analysis Services data mining models, how to execute R code inside SSIS, and how to analyze text with SSIS.

Chapter 9: On-Premises and Azure Big Data Integration

This chapter covers the Azure Feature Pack, which allows SSIS to integrate Azure data from Blob storage and HDInsight clusters. Readers learn how to use the Azure Feature Pack components to add flexibility to their SSIS solution architecture.

Chapter 10: Extending SSIS Custom Task and Transformations

This chapter covers extending and customizing the toolbox using custom-developed tasks and transforms.

Chapter 11: Scale Out with SSIS 2017

The last chapter is dedicated to SSIS 2017 and teaches you how to scale out SSIS package executions on multiple servers.

Enjoy the reading!


          Embrace R @ SQL Nexus 2017 & SQL Saturday #626        

R is the hottest topic in SQL Server 2016. If you want to learn how to use it for advanced analytics, join my seminar at the SQL Nexus conference on May 1st in Copenhagen. Although there is still nearly a month before the seminar, fewer than half of the places are still available. You are also very welcome to visit my session Using R in SQL Server, Power BI, and Azure ML during the main conference.

For beginners, I have another session in the same week, this time in Budapest. You can join me at the Introducing R session on May 6th at SQL Saturday #626 Budapest.

Here is the description of the seminar.

As an open-source development, R is the most popular analytical engine and programming language for data scientists worldwide. The number of libraries with new analytical functions is enormous and continuously growing. However, there are also some drawbacks. R is a programming language, so you have to learn it before you can use it. Open-source development also means less control over the code. Finally, the free R engine is not scalable.

Microsoft added support for R code in SQL Server 2016, in Azure Machine Learning (Azure ML), and in Power BI. A parallelized, highly scalable execution engine is used to execute the R scripts. In addition, not every library is allowed in these environments.

Attendees of this seminar learn to program with R from scratch. Basic R code is introduced using the free R engine and the RStudio IDE. The seminar then shows some more advanced data manipulation, matrix calculations, and statistical analysis, together with graphing options. The mathematics behind them is briefly explained as well. The seminar then switches to more advanced data mining and machine learning analyses. Attendees also learn how to use R code in SQL Server and Azure ML, and how to create SQL Server Reporting Services (SSRS) reports that use R.

The seminar consists of the following modules:

  • Introduction to R
  • Data overview and manipulation
  • Basic and advanced visualizations
  • Data mining and machine learning methods
  • Scalable R in SQL Server
  • Using R in SSRS, Power BI, and Azure ML

Hope to see you there!


          Data Mining Algorithms – Logistic Regression        

It’s been a while since I wrote the last blog post on the data mining / machine learning algorithms. I described the Neural Network algorithm. In addition, it is a good time to write another post to remind readers of the two upcoming seminars on these algorithms that I am giving in Oslo on Friday, September 2nd, 2016, and in Cambridge on Thursday, September 8th. Hope to see you at one of the seminars. Finally, to conclude this marketing part: if you are interested in the R language, I am preparing another seminar, “EmbRace R”, which will cover R from the basics to advanced analytics. Stay tuned.

Now for the algorithm. If you remember that post, a neural network has an input layer, an output layer, and one or more hidden layers. The Neural Network algorithm uses the hyperbolic tangent activation function in the hidden layers and the sigmoid function in the output layer. However, the sigmoid function is also called the logistic function. Therefore, describing the Logistic Regression algorithm is simple now that I have described the Neural Network: if a neural network has only input neurons that are directly connected to the output neurons, it is a Logistic Regression. Or, to repeat the same thing in a different way: Logistic Regression is a Neural Network with zero hidden layers.

This was quick :-) To add more meat to the post, I am adding the formulas and the graphs for the hyperbolic tangent and sigmoid functions.

[Image: graphs of the hyperbolic tangent and sigmoid functions]

tanh(x) = (e^x - e^-x) / (e^x + e^-x)

sigmoid(x) = 1 / (1 + e^-x)
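
To make the relationship concrete, here is a minimal NumPy sketch (my own illustration, not code from SQL Server or any product mentioned here) of logistic regression written as a neural network with zero hidden layers: the input neurons connect directly to a single sigmoid output neuron, and gradient descent on the log-loss fits the weights. The toy data and learning rate are invented for the example.

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) activation - the output layer of the network."""
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: four cases, two input variables, and a binary target.
X = np.array([[0.0, 0.1], [0.2, 0.9], [0.8, 0.2], [0.9, 0.8]])
y = np.array([0.0, 1.0, 0.0, 1.0])

w = np.zeros(X.shape[1])  # one weight per input neuron
b = 0.0                   # bias of the output neuron
lr = 0.5                  # learning rate

for _ in range(5000):
    p = sigmoid(X @ w + b)           # forward pass: inputs straight to output
    grad_w = X.T @ (p - y) / len(y)  # gradient of the log-loss w.r.t. weights
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

print(np.round(sigmoid(X @ w + b), 2))  # fitted probabilities for the four cases
```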


          Re: Machine Learning For Investing In Consumer Goods Startups        

Interesting. I wonder what specific data sources they are using.

Google could do some crazy things in this area with search, maps, and potentially even Gmail data.


           Just Buying Into Modern BI and Analytics? Get Ready for Augmented Analytics, the Next Wave of Market Disruption         
Machine learning automation is affecting all of enterprise software, but will completely transform how we build, analyze, and consume data and analytics. Over the past 10 years or more, visual-based...
           Can we Trust “Black Box” Machine Learning when it comes to Security or is there a Better Way?         
Machine learning is relatively new to security. It first went mainstream a few years ago in a few security domains such as UEBA, network traffic analytics and endpoint protection. Several...
          [raspberry-python] Readings in Programming        

"Ex-Libris" part IV: Code


I've made available part 4 of my "ex-libris" of a Data Scientist. This one is about code. 

No doubt, many have been waiting for the list that is most related to Python.  In a recent poll by KDNuggets, the top tool used for analytics, data science and machine learning by respondents turned out to also be a programming language: Python.

The article goes from algorithms and theory, to approaches, to the top languages for data science, and more. In all, almost 80 books in just that part 4 alone. It can be found on LinkedIn:

"ex-libris" of a Data Scientist - Part IV

from Algorithms and Automatic Computing Machines by B. A. Trakhtenbrot




See also


Part I was on "data and databases": "ex-libris" of a Data Scientist - Part I

Part II, was on "models": "ex-libris" of a Data Scientist - Part II



Part III, was on "technology": "ex-libris" of a Data Scientist - Part III

Part V will be on visualization, part VI on communication. Bonus after that will be on management / leadership.

Francois Dion
@f_dion

P.S.
I will also have a list of publications in French.
In the near future, I will make a list in Spanish as well.

          [PyConFr2016] Call for talks and workshops        

Edited August 4, 2016: the call for talks and workshops is now closed.

tl;dr [1]: The call for talks and workshops is now open, so don't hesitate to propose something (link removed)!!

Since 2007, thanks to the AFPy, French-speaking users of the Python language have been getting together for a few days to share their experiences, learn from one another, and present their latest discoveries in workshops, talks, and meetups.

Come and discover

PyCon-fr is the best way to discover the Python language, to go further in your use of it, to meet the authors of libraries you may use every day... and quite simply to get together for a weekend. PyCon-fr draws an average of 300 visitors each day, and no fewer than 70 talks and workshops:

  • Talks, at all levels, are a chance to discover different uses of Python,
  • "Sprints" (self-organized programming workshops) help move free and open source projects forward.

As an example, last year the associations Bibliothèque Sans-Frontières and the OCA benefited from the help of both beginner and seasoned coders.

This year, we will all gather in Rennes, on the premises of Télécom Bretagne, from October 13 to 16, 2016.

The sprints will take place on Thursday, October 13 and Friday, October 14.

The talks and workshops will take place on Saturday, October 15 and Sunday, October 16.

Your talk

Do you have an experience with Python to share?

Would you like to present your latest project to the community?

Want to ask for help and/or air your doubts? Now is the time.

The call for speakers is open until July 31!

Here are a few suggested themes from previous editions:

  • Python in education: tips and tricks for getting started or for teaching with Python
  • The Internet, the Web, scaling up, and Python
  • Clguba and crypto: encryption and privacy
  • Scientific Python: scientific and statistical computing, machine learning
  • At the heart of Python: packaging, libraries, tests, profiling, bindings
  • Around Python: provisioning, databases, JavaScript frameworks
  • Python in the real world: digital fabrication (3D printing, CNC, IoT, ...)
  • Python in the future: PyPy, Python 3 and asyncio
  • Free software with Python: your creations
  • And above all, any proposal that doesn't fit into these boxes ;)

Whether you are a seasoned user or just discovering Python, don't hesitate to propose a topic: PyCon-fr is, above all, you :)

We accept long presentations (45 min) and short ones (25 min), as well as workshops (tell us how much time you need, depending on the format).

Send us your talk/workshop proposals (link removed)

Careful: the deadline is July 31!

Hoping to be buried in proposals,

The organizing team

Pierre, Mathieu, Yann, Alexis and Rémy

[1] too long, didn't read

          Machine Learning: Making Customer Service Operations Smarter and More Strategic        
Excellent customer service operations are essential to ensure customer retention. To help companies keep up with customers’ growing expectations, SAP Service Ticket Intelligence automatically categorizes...
          How Machine Learning Helps Swarovski Fix Your Crystal Teddy Bear        
Swarovski crystals are loved by many around the world. With countless high-end designer and retail partners, they adorn everything from evening gowns to cell phone...
          How to Benefit from Upgrading Your Digital Mindset        
More technologies are simultaneously reaching maturity than at any other time in recent memory. Getting the most out of cloud, mobile, big data, IoT, machine learning, artificial intelligence and other maturing technologies will require organizations to open themselves to new ways of thinking.
          Facebook’s Latest Move to Fight Fake News Might Finally Be the Right One        

Facebook may have finally hit on a promising way to fight its “fake news” problem.

The company on Thursday announced that it is launching a feature called Related Articles, which it has been testing since April. Now, when you see certain controversial or hotly debated stories in your news feed, below them will appear a series of headlines from other publishers on the same topic.

In its April blog post explaining the test, Facebook presented Related Articles as a way to give users “easier access to additional perspectives and information, including articles by third-party fact checkers.” It gave as an example an article about “a new medical advancement,” suggesting that the related stories would help readers evaluate whether the piece their friend shared was accurate or misleading in its presentation of the findings.

Now, it seems, Facebook is comfortable pitching the feature more explicitly as a tool to counteract the spread of misinformation. In an update to that April blog post Thursday, it wrote:

Since starting this test, we’ve heard that Related Articles helps give people more perspectives and additional information, and helps them determine whether the news they are reading is misleading or false. So we’re rolling this out more broadly.
Now, we will start using updated machine learning to detect more potential hoaxes to send to third-party fact checkers. If an article has been reviewed by fact checkers, we may show the fact checking stories below the original post. In addition to seeing which stories are disputed by third-party fact checkers, people want more context to make informed decisions about what they read and share. We will continue testing updates to Related Articles and other ongoing News Feed efforts to show less false news on Facebook and provide people context if they see false news.

When Facebook says “false news,” it’s referring at least in part to what became popularly known as “fake news” during the 2016 U.S. presidential election. Much of that “fake news”—an ill-defined category that seemed to include everything from deliberate hoaxes to mainstream news stories that some perceived as biased or misleading—revolved around politics and catered to the partisan viewpoints of one group or another. That made it a particularly thorny problem for Facebook, which risked being tarred as politically slanted if it flagged or suppressed posts based on the judgment of its own editors or software engineers.

Facebook was so reluctant to wade into the murky waters of editorial judgment that it first denied “fake news” was a real problem. When that backfired, the company began looking for ways to tackle it in earnest. One of its first major initiatives was a partnership with third-party fact-checkers, in which Facebook would flag potentially false or misleading posts as “disputed” in users’ feeds. The company relied in part on its own users to report those posts, which it could then pass on to the fact-checking organizations for careful vetting.

That approach, cautious as it was, still left Facebook open to charges of bias and even censorship (although that’s a misapplication of the term) from those who took issue with the fact-checkers’ conclusions. The process is also labor-intensive, meaning that only a small fraction of misleading stories would likely be flagged as such in a timely manner. Even if Facebook overcame that problem, psychologists doubted the efficacy of the approach.

Related Articles won’t singlehandedly solve the fake news problem, either. But there is at least some academic research suggesting that it could make a real difference in readers’ perceptions. Just as importantly, from Facebook’s standpoint, it should insulate the company from cries of censorship, since surrounding a story with related articles doesn’t necessarily imply any editorial judgment about its credibility. A reader’s understanding of just about any story could benefit from additional context, so there’s little danger in “false positives,” as there is when you’re flagging an article as disputed.

This still leaves the deeper problem of the biases embedded in the very structure of the news feed. But it’s a sensible measure nonetheless, and one that suggests Facebook is capable of applying its employees’ bright minds to a societal problem broader than increasing users' engagement or monetizing their data.


          Microsoft Word’s Grammar and Style Tools Will Make Your Writing Worse        

Microsoft Word is all too easy to hate. As one of my colleagues at Slate put it in a recent conversation, the venerable program’s ubiquity makes it a bit like the cable company of the software world: You learn to loathe it precisely because you spend so much time interacting with it. And though it does many things right—track changes has only gotten better over the years, for example—a few of its most prominent qualities remain maddening, none more so than its grammar and style check features.

I use the word “features” here in the loosest possible sense. When I think of the way Word marks up writing, I sometimes imagine a drunk composition professor rushing to grade a stack of papers late at night, alternately sobbing and steaming with anger. These “corrections” create problems of their own, products of slavish rule-following that rarely reflect the practical realities of real language or the challenges it presents.

Microsoft, for what it’s worth, would probably object to this characterization. In an FAQ about Word’s grammar proofing, the company claims the tool “performs a comprehensive and accurate analysis … of the submitted text, instead of just using a series of heuristics (or pattern matching) to flag errors.” In other words, it’s saying that Word doesn’t just check whether a sentence violates a set of rules, it evaluates how the sentence works.

That would be terrific if it were true. Alas, it is not.

By way of evidence, you need only read on in the FAQ itself. A few lines down, it offers this example sentence: “The legend says that that Kingdom was created by three ancient magicians, whose magical powers governed the world and made them immortal and all-powerful.” Noting that this is a passive construction (“the Kingdom was created”), Microsoft suggests that it should be rewritten to read, “The legend says that three ancient magicians, whose magical powers governed the world and made them immortal and all-powerful, created that Kingdom.”

There are several problems with this revision. In practice, we encourage writers to avoid the passive voice because it obscures agency, making it difficult to determine who is doing what to whom. In some cases, however, that can be a good thing, most of all when you want to emphasize the grammatical object of a sentence rather than its subject. In this case, where “the Kingdom” seems to be the most important detail, such an inversion might be appropriate. Even if it isn’t, though, the revised sentence still scans poorly, since the now-active verb (“created”) is separated from the subject by a 13-word clause. Accordingly, the new phrasing remains difficult to follow, making the alteration dubiously useful at best.

In other cases, Word’s suggestions can actually introduce grammatical errors as they work to clean up a document’s style. When Microsoft announced a new “cloud-based” service called Editor in 2016, it claimed that the service’s use of “machine learning and natural language processing” would make Word’s recommendations better. By way of example, it noted that the tool would propose “‘most’ in place of ‘the majority of.’ ” Here too, though, there’s trouble, since “most” will make some sentences, including one that Microsoft shows in an accompanying image, more awkward. In fact, some of the other alternatives visible in that image would render the sentence actively ungrammatical:

If there’s one thing that makes Word’s grammar and style tools frustrating, it’s this insistence on unnecessary fixes that produce worse problems. Despite its hatred for the adverb “actually,” which it consistently tries to strike from sentences, Word’s grammar checker is the mansplainiest of all digital assistants, butting in even when it has no idea what it’s talking about. Lest you imagine that I was overselling the program’s distaste for “actually,” here’s an image showing that it sometimes flags the word even when you’re talking about the word itself:

And here’s one showing the program’s suggested revision to that sentence:

I understand the underlying impetus here. More often than not, “actually” is a linguistic crutch, deployed to introduce emphasis where none is needed. As a rule, it’s a word best used sparingly, if at all. To the extent that Word calls attention to our potential overuse, it may actually be helpful. Good writing is conscientious writing and such notifications may make us more mindful. (Even now, I find myself wondering whether I needed it in that earlier sentence.) But Word’s obsessive focus on strict standards leads me to doubt my prose far more frequently than it helps me to engage with and improve on it.

Other problems abound. When a sentence begins with the word “So,” for example, the program insists that it should be followed by a comma, even when inserting one would change the sentence’s meaning. That’s arguably evident in a screenshot provided by Slate copy chief Abby McIntyre, as is an even more inexplicable “fix” where Word proposed replacing the contraction “there’s” with “there as.”

Ultimately, it comes down to this: Microsoft’s grammar and style tools aren’t helping writers understand their work, just arbitrarily imposing anemic pedantry on them.

Having spent years teaching college-level writing courses, I understand the challenge that Microsoft faces all too well. Students often glom on to the wrong lessons, treating rules of thumb as if they were absolute dictates. Word’s approach seems to be the product of such lessons-gone-wrong. It gives you exactly the sort of advice you’d expect to get if you had a group of software developers analyze your writing. They would, I suspect, confuse well-crafted prose with half-remembered advice from a single session with a writing coach, leading them to impose baroque standards on it, much as Word does.

There’s a simple solution, of course: Turn the damn thing off. On reflection, though, the real question is why Microsoft turned it on in the first place.


          Cadence @ DAC: What to Expect and What to See        

Cadence returns to DAC 2017 this year, showcasing our full verification suite. Here are some of the things you can look forward to from us in the upcoming week.

Once again, Cadence has the Expert Bar on Monday, Tuesday, and Wednesday. The Expert Bar is where engineers can visit our booth and have conversations with our technical experts. Cadence will be running many sessions, and those topics are listed below.

Topics List | Scheduled Time | Featured Products
Automotive: Functional Safety Focus | Tues 1:00-2:30, Wed 4:00-6:00 | Xcelium Safety, DSG full-flow
Simplify SoC Verification with VIP | Mon 2:30-4:00, Tues 11:30-1:00 | Cadence VIP
Performance Analysis and Traffic Optimization for ARM-Based SoCs | Mon 4:00-6:00, Wed 2:30-4:00 | Interconnect Workbench, Palladium Z1, Xcelium simulator, vManager
Formal Verification Featuring the JasperGold Platform | Tues 2:30-4:00, Wed 2:30-4:00 | JasperGold Apps
System Verification and HW/SW Co-Verification with the Palladium Z1 Platform | Tues 1:00-2:30, Wed 10:00-11:30 | Palladium Z1
Software Development with Protium S1 FPGA-Based Prototyping Platform | Tues 4:00-6:00, Wed 11:30-1:00 | Protium S1
Verification Fabric: Portable Stimulus Generation Featuring Perspec System Verifier | Tues 10:00-11:30, Wed 1:00-2:30 | Perspec System Verifier
High-Performance Simulation with Xcelium Parallel Simulation | Tues 11:30-1:00, Wed 1:00-2:30 | Xcelium Simulator
Verification Fabric: Plan, Coverage, and Debug with vManager and Indago Solutions | Tues 1:00-2:30, Wed 4:00-6:00 | vManager, Indago
The Future of Verification with the Cadence Verification Suite | Mon 2:30-4:00 | Cadence Verification Suite
Cadence Verification Implementation Solutions for ARM-Based Designs | Mon 10:00-11:30, Tues 1:00-2:30 |

Cadence will also be offering Tech Sessions—hour-long presentations about a singular topic. These will be held throughout DAC and cover the breadth of verification as listed below:

Topics List | Scheduled Time | Featured Products
Finding More Bugs Earlier in IP Verification by Integrating Formal Verification with UVM | Mon 3:30-4:30 | JasperGold Apps, Verification IP, Xcelium Single-Core Simulator
High-Speed SoC Verification Leveraging Portable Stimulus with Multi-Core Simulation and Hardware Acceleration | Tues 2:30-3:30 | Perspec System Verifier, Xcelium Multi-Core Simulator, Palladium Z1
Optimally Balancing FPGA-Based Prototyping and Emulation for Verification, Regressions, and Software Development | Wed 10:30-11:30 | Palladium Z1, Protium S1, Palladium Hybrid
Automotive Functional Safety Verification | Tues 3:30-4:30 | Xcelium Safety
RTL Designer Signoff with JasperGold Superlint and CDC Apps | Tues 12:30-1:30, Wed 11:30-12:30 | JasperGold Apps
Cadence Verification Suite: Core Engines, Fabric Technologies, and Solutions | Wed 2:30-3:30 | Cadence Verification Suite

In addition to these presentations, Cadence will be hosting a verification luncheon that offers a panel of experts from a variety of companies to answer verification-related questions. In Monday’s luncheon, Cadence will share a table with Vista Ventures LLC, Hewlett Packard Enterprise, and Intel to discuss “Towards Smarter Verification”—a panel asserting that the next big change in verification technology is not necessarily a new engine, but improved communication and compatibility between existing engines that may be optimized for different tasks. This panel will talk about how verification is changing in today’s application-specific world, as well as utilizing machine learning technology to assist in data analytics, among other topics.

Cadence technology experts will also be holding other events during DAC. Of chief importance is “Tutorial 8: An Introduction to the Accellera Portable Stimulus Standard,” presented by Sharon Rosenberg in room 18CD on Monday from 1:30pm to 3:00pm. Another key event is the Designer/IP Track Poster Session on “Automating Generation of System Use Cases Using Model-Based Portable Stimulus Approach,” presented by Frederik Kautz, Christian Sauer, and Joerg Simon from 5:00pm to 6:00pm on the Exhibit Floor.

We have many exciting things in store for those who attend, and we hope to see you all at DAC this week!


          Venkatesh Saligrama        
Our webpage has moved to http://sites.bu.edu/data/. I run the Data Science & Machine Learning laboratory at Boston University. The lab is involved in projects related to Machine Learning, Vision & Learning, Structured Signal Processing, and Decision & Control. The laboratory is led by Prof. Venkatesh Saligrama. In the area of machine learning, recent research projects […]
          Data Scientist - Maths Modelling, Python, Forecasting, Machine Learning techniques, Cambridge, to 45k DoE: ECM SELECTION        
£Negotiable: ECM SELECTION
For more of the latest jobs and jobs in London & South East England, visit brightrecruits.com
           Modeling cognitive development on balance scale phenomena         
Shultz, T.R. and Mareschal, Denis and Schmidt, W.C. (1994) Modeling cognitive development on balance scale phenomena. Machine Learning 16 (1/2), pp. 57-86. ISSN 0885-6125.
          Artificial Intelligence and e-commerce: Who wins        
Utter the words artificial intelligence, and the robots and creatures of science fiction movies are usually the first things that spring to mind. But more and more the workings of artificial intelligence and machine learning are infiltrating our everyday lives – even if we’re not always 100% aware. Take shopping […]
          Goodbye, Google Now: Google updates its mobile Google app with a new name, 'Feed'        

Ever since its launch in 2012, the Google Now brand has referred to the voice-controlled personal assistant feature and the card-based information feature, both living inside the Google app on Android and iOS.

But last year, Google released a new assistant under the Google Assistant brand to take its place, and the Google Now brand began to fade away. Today, Google updated the Google app with a new 'Feed' feature that shows information based on each user's interests, bringing the Google Now brand to a complete end.

Google calls the new information feature Feed. It uses machine learning techniques to learn users' behavior, working out which issues or topics they are interested in from their Google searches, and users can also follow topics to tell the app what interests them.

This update applies to both Android and iOS. It has already rolled out in the US and will reach users in other countries over the coming weeks.

Source - Google Blog, Search Engine Land



          Holiday reading that will change your life - 360 free ebooks to download        
It is quite common to come across interesting ebooks tied to a given technology within Microsoft's product range. And lately, more than enough of them have been published. So it is time to put together another overview of what is currently available, and there really is a lot of it: specifically, 360 books that you can download for free and use to dive into the world of technologies you may not have discovered yet. Which one will become your favorite?

Category | Title | Format
Azure | Introducing Windows Azure™ for IT Professionals | PDF MOBI EPUB
Azure | Microsoft Azure Essentials Azure Automation | PDF MOBI EPUB
Azure | Microsoft Azure Essentials Azure Machine Learning | PDF MOBI EPUB
Azure | Microsoft Azure Essentials Fundamentals of Azure | PDF MOBI EPUB
Azure | Microsoft Azure Essentials Fundamentals of Azure, Second Edition | PDF
Azure | Microsoft Azure Essentials Fundamentals of Azure, Second Edition Mobile | PDF
Azure | Microsoft Azure Essentials Migrating SQL Server Databases to Azure – Mobile | PDF
Azure | Microsoft Azure Essentials Migrating SQL Server Databases to Azure 8.5X11 | PDF
Azure | Microsoft Azure ExpressRoute Guide | PDF
Azure | Overview of Azure Active Directory | DOC
Azure | Rapid Deployment Guide For Azure Rights Management | PDF
Azure | Rethinking Enterprise Storage: A Hybrid Cloud Model | PDF MOBI EPUB
BizTalk | BizTalk Server 2016 Licensing Datasheet | PDF
BizTalk | BizTalk Server 2016 Management Pack Guide | DOC
Cloud | Enterprise Cloud Strategy | PDF MOBI EPUB
Cloud | Enterprise Cloud Strategy – Mobile | PDF
Developer | .NET Microservices: Architecture for Containerized .NET Applications | PDF
Developer | .NET Technology Guidance for Business Applications | PDF
Developer | Building Cloud Apps with Microsoft Azure™: Best practices for DevOps, data storage, high availability, and more | PDF MOBI EPUB
Developer | Containerized Docker Application Lifecycle with Microsoft Platform and Tools | PDF
Developer | Creating Mobile Apps with Xamarin.Forms, Preview Edition 2 | PDF MOBI EPUB
Developer
Abstract
Motivation: Accurate contact predictions can be used for predicting the structure of proteins. Until recently these methods were limited to very big protein families, decreasing their utility. However, recent progress by combining direct coupling analysis with machine learning methods has made it possible to predict accurate contact maps for smaller families. To what extent these predictions can be used to produce accurate models of the families is not known.

Results: We present the PconsFold2 pipeline that uses contact predictions from PconsC3, the CONFOLD folding algorithm and model quality estimations to predict the structure of a protein. We show that the model quality estimation significantly increases the number of models that reliably can be identified. Finally, we apply PconsFold2 to 6379 Pfam families of unknown structure and find that PconsFold2 can, with an estimated 90% specificity, predict the structure of up to 558 Pfam families of unknown structure. Out of these, 415 have not been reported before.

Availability and Implementation: Datasets as well as models of all the 558 Pfam families are available at http://c3.pcons.net/. All programs used here are freely available.

Contact: arne@bioinfo.se
          When loss-of-function is loss of function: assessing mutational signatures and impact of loss-of-function genetic variants        
Abstract
Motivation: Loss-of-function genetic variants are frequently associated with severe clinical phenotypes, yet many are present in the genomes of healthy individuals. The available methods to assess the impact of these variants rely primarily upon evolutionary conservation with little to no consideration of the structural and functional implications for the protein. They further do not provide information to the user regarding specific molecular alterations potentially causative of disease.

Results: To address this, we investigate protein features underlying loss-of-function genetic variation and develop a machine learning method, MutPred-LOF, for the discrimination of pathogenic and tolerated variants that can also generate hypotheses on specific molecular events disrupted by the variant. We investigate a large set of human variants derived from the Human Gene Mutation Database, ClinVar and the Exome Aggregation Consortium. Our prediction method shows an area under the Receiver Operating Characteristic curve of 0.85 for all loss-of-function variants and 0.75 for proteins in which both pathogenic and neutral variants have been observed. We applied MutPred-LOF to a set of 1142 de novo variants from neurodevelopmental disorders and find enrichment of pathogenic variants in affected individuals. Overall, our results highlight the potential of computational tools to elucidate causal mechanisms underlying loss of protein function in loss-of-function variants.

Availability and Implementation: http://mutpred.mutdb.org

Contact: predrag@indiana.edu

          Dataism: Getting out of the 'job loop' and into the 'knowledge loop'        
From deities to data - "For thousands of years humans believed that authority came from the gods. Then, during the modern era, humanism gradually shifted authority from deities to people... Now, a fresh shift is taking place. Just as divine authority was legitimised by religious mythologies, and human authority was legitimised by humanist ideologies, so high-tech gurus and Silicon Valley prophets are creating a new universal narrative that legitimises the authority of algorithms and Big Data." Privileging the right of information to circulate freely - "There's an emerging market called Dataism, which venerates neither gods nor man - it worships data. From a Dataist perspective, we may interpret the entire human species as a single data-processing system, with individual humans serving as its chips. If so, we can also understand the whole of history as a process of improving the efficiency of this system... Like capitalism, Dataism too began as a neutral scientific theory, but is now mutating into a religion that claims to determine right and wrong... Just as capitalists believe that all good things depend on economic growth, so Dataists believe all good things - including economic growth - depend on the freedom of information." Our unparalleled ability to control the world around us is turning us into something new - "We have achieved these triumphs by building ever more complex networks that treat human beings as units of information. Evolutionary science teaches us that, in one sense, we are nothing but data-processing machines: we too are algorithms. By manipulating the data we can exercise mastery over our fate." Planet of the apps - "Many of the themes of his first book are reprised: the importance of the cognitive revolution and the power of collaboration in speeding the ascent of Man; the essential power of myths — such as religion and money — in sustaining our civilisations; and the inexcusable brutality with which our species treats other animals. But having run out of history to write about, Harari is forced to turn his face to the future... 'Forget economic growth, social reforms and political revolutions: in order to raise global happiness levels, we need to manipulate human biochemistry'... For the moment, the rise of populism, the rickety architecture of the European Union, the turmoil in the Middle East and the competing claims on the South China Sea will consume most politicians' attention. But at some time soon, our societies will collectively need to learn far more about these fast-developing technologies and think far more deeply about their potential use." also btw...
  • Preparing for our Posthuman Future of Artificial Intelligence - "By exploring the recent books on the dilemmas of AI and Human Augmentation, how can we better prepare for (and understand) the posthuman future? By David Brin." (omni o)
  • The Man-Machine Myth - "Beliefs inspired by the cybernetic mythos have a quasi-theological character: They tend to be faith-based."
  • Unsettling thought of the day
  • Each technological age seems to have a "natural" system of government that's the most stable and common... Anyway, now we've entered a new technological age: the information age. What is the "natural" system of government for this age?

    An increasing number of countries now seem to be opting for a new sort of illiberal government - the style of Putin and the CCP. This new thing - call it Putinism - combines capitalism, a "deep state" of government surveillance, and social/cultural fragmentation.

    It's obviously way too early to tell, but there's an argument to be made that Putinism is the natural system of government now. New technology fragments the media, causing people to rally to sub-national identity groups instead of to the nation-state.

    The Putinist "deep state" commands the heights of power with universal surveillance, and allies with some rent-collecting corporations. Meanwhile, IF automation decreases labor's share of income and makes infantry obsolete, the worker/soldier class becomes less valuable.

    "People power" becomes weak because governments can suppress any rebellion with drones, surveillance, and other expensive weaponry. Workers can strike, but - huge hypothetical assumption alert! - they'll just be replaced, their bargaining power low due to automation.

    In sum: Powerful authoritarian governments, fragmented society, capitalism, "Hybrid warfare", and far less liberty.
  • The Totalitarian - "Putinist models seem to curtail personal freedom and self-expression. Chases away innovation class. In the long run this makes them unable to keep up with more innovative, open societies. But innovative open societies are also fissiparous in the long run. They need a strong centralized, even authoritarian, core. To wit the big democracies also have deep states, just ones that infringe on domestic public life less than Putinist do. Automation makes mass citizenry superfluous as soldiers, workers or taxpayers. The insiders' club is ever-shrinking. Steady state of AI era is grim. One demigod and 10 billion corpses/brain-in-jars depending on humanism quotient of the one. The three pillars for this end state are strong AI, mind uploading/replication, and mature molecular nanotechnology."
  • Capitalism and Democracy: The Strain Is Showing - "Confidence in an enduring marriage between liberal democracy and global capitalism seems unwarranted."
  • So what might take its place? One possibility[:] ... a global plutocracy and so in effect the end of national democracies. As in the Roman empire, the forms of republics might endure but the reality would be gone.

    An opposite alternative would be the rise of illiberal democracies or outright plebiscitary dictatorships... [like] Russia and Turkey. Controlled national capitalism would then replace global capitalism. Something rather like that happened in the 1930s. It is not hard to identify western politicians who would love to go in exactly this direction.

    Meanwhile, those of us who wish to preserve both liberal democracy and global capitalism must confront serious questions. One is whether it makes sense to promote further international agreements that tightly constrain national regulatory discretion in the interests of existing corporations... Above all... economic policy must be orientated towards promoting the interests of the many not the few; in the first place would be the citizenry, to whom the politicians are accountable. If we fail to do this, the basis of our political order seems likely to founder. That would be good for no one. The marriage of liberal democracy with capitalism needs some nurturing. It must not be taken for granted.
  • G20 takes up global inequality challenge - "Even before the final communiqué is drafted for the annual G20 summit the leaders of the world's largest economies already seemed to agree on their most pressing priority: to find a way to sell the benefits of globalisation to an increasingly sceptical public. As they arrived in the Chinese city of Hangzhou over the weekend, many were on the defensive amid a welter of familiar complaints back home: frustratingly slow growth, rising social inequality and the scourge of corporate tax avoidance."
  • "Growth drivers from the previous round of technological progress are fading while a new technological and industrial revolution has yet to gain momentum," Mr Xi said at the start of the G20, adding that the global economy was at a "critical juncture".

    "Here at the G20 we will continue to pursue an agenda of inclusive and sustainable growth," Mr Obama said, acknowledging that "the international order is under strain".

    Mr Xi, whose country has arguably benefited more than any other from globalisation, struck a similarly cautious note in a weekend speech to business leaders. In China, he said, "we will make the pie bigger and make sure people get a fairer share of it".

    He also recognised global inequity, noting that the global gini coefficient — the standard measure of inequality — had raced past what he called its "alarm level" of 0.6 and now stood at 0.7. "We need to build a more inclusive world economy," Mr Xi said.
  • G20 leaders urged to 'civilise capitalism' - "Chinese president Xi Jinping helped set the tone of this year's G20 meeting in a weekend address to business executives. 'Development is for the people, it should be pursued by the people and its outcomes should be shared by the people', Mr Xi said... Before the two-day meeting, the US government argued that a 'public bandwagon' was growing to ditch austerity in favour of fiscal policy support. 'Maybe the Germans are not absolutely cheering for it but there is a growing awareness that 'fiscal space' has to be used to a much greater extent', agreed Ángel Gurría, secretary-general of the Organisation for Economic Cooperation and Development."
  • Martin Wolf calls for basic income, land taxation & intellectual property reform: Enslave the robots and free the poor
  • The rise of intelligent machines is a moment in history. It will change many things, including our economy. But their potential is clear: they will make it possible for human beings to live far better lives. Whether they end up doing so depends on how the gains are produced and distributed. It is possible that the ultimate result will be a tiny minority of huge winners and a vast number of losers. But such an outcome would be a choice not a destiny. A form of techno-feudalism is unnecessary. Above all, technology itself does not dictate the outcomes. Economic and political institutions do. If the ones we have do not give the results we want, we must change them.
  • From the Job Loop to the Knowledge Loop (via Universal Basic Income) - "We work so we can buy stuff. The more we work, the more we can buy. And the more is available to buy, the more of an incentive there is to work. We have been led to believe that one cannot exist without the other. At the macro level we are obsessed with growth (or lack thereof) in consumption and employment. At the individual level we spend the bulk of our time awake working and much of the rest of it consuming."
  • I see it differently. The real lack of imagination is to think that we must be stuck in the job loop simply because we have been in it for a century and a half. This is to confuse the existing system with humanity's purpose.

    Labor is not what humans are here for. Instead of the job loop we should be spending more of our time and attention in the knowledge loop [learn->create->share]... if we do not continue to generate knowledge we will all suffer a fate similar to previous human societies that have gone nearly extinct, such as the Easter Islanders. There are tremendous threats, eg climate change and infectious disease, and opportunities, eg machine learning and individualized medicine, ahead of us. Generating more knowledge is how we defend against the threats and seize the opportunities.
  • What's more scarce: money, or attention? - "Attention is now the scarce resource."

          Computer Programming Algorithms Directory        
Encryption Algorithms
  • Advanced Encryption Standard (AES), Data Encryption Standard (DES), Triple-DES and Skipjack Algorithms - Offers descriptions of the named encryption algorithms.
  • Blowfish - Describes the Blowfish encryption algorithm. Offers source code for a variety of platforms.
  • KremlinEncrypt - Cryptography site provides an overview of cryptography algorithms and links to published descriptions where available.
  • PowerBASIC Crypto Archives - Offers PowerBASIC source code for many algorithms including:
    Hashing - RIPEMD-160, MD5, SHA-1, SHA-256, CRC-16, CRC-32, Adler-32, FNV-32, ELF-32
    Encryption - RSA-64, Diffie-Hellman-Merkle Secure Key Exchange, Rijndael, Serpent, Twofish, CAST-128, CAST-256, Skipjack, TEA, RC4, PC1, GOST, Blowfish, Caesar Substitutional Shift, ROT13
    Encoding - Base64, MIME Base64, UUEncode, yEnc, Neuronal Network, URLEncode, URLDecode
    Compression - LZ78, LZSS, LZW, RLE, Huffman, Supertiny
    Pseudo-Random Number Generation (PRNG) - Mersenne Twister Number Generator, Cryptographic PRNG, MPRNG, MOAPRNG, L'Ecuyer LCG3 Composite PRNG, W32.SQL-Slammer
  • TEA - Tiny Encryption Algorithm - Describes the TEA encryption algorithm with C source code (a Python transcription follows this list).
  • xICE - Has links towards the bottom of the page to the description of the xice encryption algorithm as well as the xice software development kit which contains the algorithm's full source code in C++, ASP, JScript, Ruby, and Visual Basic 6.0.
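
Since TEA is famously tiny, here is a Python transcription of the standard published routine for reference (the page above provides the original C); the key and plaintext below are arbitrary test values.

```python
# TEA works on a 64-bit block (two 32-bit halves) with a 128-bit key
# (four 32-bit words). The & MASK operations emulate C's uint32 wrap-around.
MASK = 0xFFFFFFFF
DELTA = 0x9E3779B9  # key schedule constant, derived from the golden ratio

def _mix(x, s, ka, kb):
    return ((((x << 4) & MASK) + ka) ^ ((x + s) & MASK) ^ (((x >> 5) + kb) & MASK)) & MASK

def tea_encrypt(v0, v1, k, rounds=32):
    s = 0
    for _ in range(rounds):
        s = (s + DELTA) & MASK
        v0 = (v0 + _mix(v1, s, k[0], k[1])) & MASK
        v1 = (v1 + _mix(v0, s, k[2], k[3])) & MASK
    return v0, v1

def tea_decrypt(v0, v1, k, rounds=32):
    s = (DELTA * rounds) & MASK
    for _ in range(rounds):
        v1 = (v1 - _mix(v0, s, k[2], k[3])) & MASK
        v0 = (v0 - _mix(v1, s, k[0], k[1])) & MASK
        s = (s - DELTA) & MASK
    return v0, v1

key = (0x01234567, 0x89ABCDEF, 0xFEDCBA98, 0x76543210)  # arbitrary test key
cipher = tea_encrypt(0xDEADBEEF, 0x0BADF00D, key)
print(tea_decrypt(*cipher, key) == (0xDEADBEEF, 0x0BADF00D))  # True
```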

Genetic Algorithms

  • Artificial Life - Offers executable and source for ant food collection and the travelling salesman problems using genetic algorithms
  • Genetic Ant Algorithm - Source code for a Java applet that implements the Genetic Ant Algorithm based upon the model given in Koza, Genetic Programming, MIT Press
  • Introduction to Genetic Algorithms - Introduces fundamentals, offers Java applet examples
  • Jaga - Offers a free, open source API for implementing genetic algorithms (GA) and genetic programming (GP) applications in Java
  • SPHINcsX - Describes a methodology to perform a generalized zeroth-order two- and three-dimensional shape optimization utilizing a genetic algorithm

GIS (Geographic Information Systems) Algorithms

  • Efficient Triangulation Algorithm Suitable for Terrain Modelling - Describes algorithm and includes links to source code for various languages
  • Prediction of Error and Complexity in GIS Algorithms - Describes algorithms for GIS sensitivitiy analysis
  • Point in Polygon Algorithm - Describes the algorithm (a minimal sketch of the classic even-odd test follows this list)
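
For reference, here is a minimal sketch of the classic even-odd (ray casting) point-in-polygon test: cast a ray from the query point and count how many polygon edges it crosses; an odd count means the point is inside. The square below is just a test fixture.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd rule: toggle 'inside' each time a rightward ray from (x, y)
    crosses a polygon edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal line through y
            # x-coordinate where the edge crosses that horizontal line
            x_cross = (x2 - x1) * (y - y1) / (y2 - y1) + x1
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(point_in_polygon(2, 2, square))  # True
print(point_in_polygon(5, 2, square))  # False
```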

Sorting Algorithms

  • Andrew Kitchen's Sorting Algorithms - Describes parallel sorting algorithms:
    Odd-Even Transposition Sort has a worst case time of O(n), running on n processors. Its absolute speed up is O(log n), so its efficiency is O((log n)/n) (a sequential sketch follows this section)
    Shear Sort has a worst case time of O(n½ log n), running on n processors. Its absolute speed up is O(n½), so its efficiency is O(1/n½)
  • Ariel Faigon's Library of Sorting Algorithms - C source code for a variety of sorting algorithms including Insertion Sort, Quick Sort, Shell Sort, Gamasort, Heap Sort and Sedgesort (Robert Sedgewick quicksort optimization)
  • Flash Sort - Describes the FlashSort algorithm which sorts n elements in O(n) time
  • Michael Lamont's Sorting Algorithms - Describes common sorting algorithms:
    O(n²) Sorts - bubble, insertion, selection and shell sorts
    O(n log n) Sorts - heap, merge and quick sorts
  • Sequential and Parallel Sorting Algorithms - Describes many sorting algorithms:
    Quicksort
    Heapsort
    Shellsort
    Mergesort
    Sorting Networks
    Bitonic Sort
    Odd-Even Mergesort
    LS3-Sort
    4-way Mergesort
    Rotate Sort
    3n-Sort
    s^2-way Mergesort
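
As a companion to the Odd-Even Transposition Sort entry above, here is a minimal sequential simulation in Python; on a real parallel machine, each phase's compare-exchanges would run simultaneously on separate processors, which is where the O(n) worst case on n processors comes from.

```python
def odd_even_transposition_sort(a):
    """n phases, alternating between even- and odd-indexed pairs; each phase's
    compare-exchanges are independent, so they can all run in parallel."""
    n = len(a)
    for phase in range(n):
        for i in range(phase % 2, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

print(odd_even_transposition_sort([5, 3, 8, 1, 9, 2]))  # [1, 2, 3, 5, 8, 9]
```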

Search Algorithms

  • Exact String Matching Algorithms - Details 35 exact string search algorithms.
  • Finding a Loop in a Singly Linked List - Outlines several methods for identifying loops in a singly linked list (one classic method is sketched after this list).
  • Fibonaccian search - Describes an O(log n) search algorithm for sorted arrays that is faster than a binary search for very large arrays.
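
One classic method outlined on the loop-detection page above is commonly known as Floyd's cycle detection, or the "tortoise and hare"; here is a minimal sketch, with a hand-built three-node list as the test case.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

def has_loop(head):
    """Advance one pointer by 1 and another by 2; they can only meet again
    if the list cycles. O(n) time, O(1) extra space."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next
        fast = fast.next.next
        if slow is fast:
            return True
    return False

a, b, c = Node(1), Node(2), Node(3)
a.next, b.next = b, c
print(has_loop(a))  # False
c.next = b          # introduce a cycle
print(has_loop(a))  # True
```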

Tree Algorithms

  • B-Trees: Balanced Tree Data Structures - Introduction to B-Trees. Describes searching, splitting and inserting algorithms.

Computational Geometry Algorithms

  • CGAL - Offers open source C++ library of computational geometry algorithms.
  • FastGEO - Offers source code for a library of computational geometry algorithms such as geometrical primitives and predicates, hull construction, triangulation, clipping, rotations and projections using the Object Pascal language.
  • Wykobi - FastGEO library ported to C++.

Phonetic Algorithms

  • Lawrence Philips' Metaphone Algorithm - Describes an algorithm which returns the rough approximation of how an English word sounds. Offers a variety of source code listings for the algorithm.
  • Soundex Algorithms - Describes the NYSIIS VS Soundex and R. C. Russell's soundex algorithms.

Project Management Algorithms

  • Calculations for Critical Path Scheduling - Describes the algorithms for calculating critical paths with both ADM and PDM networks (a minimal forward/backward-pass sketch follows this list).
  • Resource Leveling Using the Minimum Moment Heuristic - Offers a Windows 3.1 download that includes a .PDF document describing the algorithm.
  • Resource-Constrained Project Scheduling - (.PDF) Describes several algorithms for resource leveling:
    Basic Single Mode RCPSP
    Basic Multi-Mode RCPSP
    Stochastic RCPSP
    Bin Packing related RCPSP
    Multi-resource constrained project scheduling problem (MRCPSP)
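
To make the critical path calculations concrete, here is a minimal forward/backward-pass sketch; the four activities and their durations are invented, and the activity list is assumed to be in topological order.

```python
# Hypothetical activities: (name, duration, predecessors), topologically ordered.
activities = [
    ("A", 3, []),
    ("B", 2, ["A"]),
    ("C", 4, ["A"]),
    ("D", 1, ["B", "C"]),
]

def critical_path(activities):
    """Forward pass computes earliest finishes, backward pass latest finishes;
    activities with zero float (LF - EF == 0) form the critical path."""
    dur = {n: d for n, d, _ in activities}
    # Forward pass: earliest finish = max of predecessors' finishes + duration.
    ef = {}
    for n, d, preds in activities:
        ef[n] = max((ef[p] for p in preds), default=0) + d
    project_end = max(ef.values())
    # Backward pass: latest finish = min of successors' latest starts.
    succs = {n: [m for m, _, ps in activities if n in ps] for n, _, _ in activities}
    lf = {}
    for n, d, _ in reversed(activities):
        lf[n] = min((lf[m] - dur[m] for m in succs[n]), default=project_end)
    return [n for n, _, _ in activities if lf[n] - ef[n] == 0]

print(critical_path(activities))  # ['A', 'C', 'D']
```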

Miscellaneous Algorithms

  • AI Horizon - Has a variety of algorithms, from basic computer science data structures such as 2-3 trees to AI-related algorithms such as minimax and a discussion of machine learning algorithms.
  • Hash Algorithms - Overview and source code (in C, Pascal and Java) for many general purpose hashing algorithms.
  • Porter Stemming Algorithm - Describes a process for removing the commoner morphological and inflexional endings from words in English. Its main use is as part of a term normalisation process that is usually done when setting up Information Retrieval systems.
  • Rubik's Cube - Solves a Rubik's Cube using the BestFast search algorithm and profile tables.
  • Simulated Annealing - The fundamental idea is to allow moves to solutions of worse quality than the current solution (uphill moves) in order to escape local minima; the probability of accepting such a move decreases during the search (a minimal sketch follows this list).
  • The Stony Brook Algorithm Repository - Offers a collection of algorithm implementations for over seventy of the most fundamental problems in combinatorial algorithms:
  1. Data Structures - Dictionaries, Priority Queues, Suffix Trees and Arrays, Graph Data Structures, Set Data Structures, Kd-Trees
  2. Numerical Problems - Solving Linear Equations, Bandwidth Reduction, Matrix Multiplication, Determinants and Permanents, Linear Programming/Simplex Method, Random Number Generation, Factoring and Primality Testing, Arbitrary Precision Arithmetic, Knapsack Problem, Discrete Fourier Transform
  3. Combinatorial Problems - Sorting, Searching, Median and Selection, Permutations, Subsets, Partitions, Graphs, Calendrical Calculations, Job Scheduling, Satisfiability
  4. Graph Problems - Polynomial Time Problems (Connected Components, Topological Sorting, Minimum Spanning Tree, Shortest Path, Transitive Closure and Reduction, Matching, Eulerian Cycle / Chinese Postman, Edge and Vertex Connectivity, Network Flow, Drawing Graphs Nicely, Drawing Trees, Planarity Detection and Embedding)
  5. Graph Problems - Hard Problems (Clique, Independent Set, Vertex Cover, Traveling Salesman Problem, Hamiltonian Cycle, Graph Partition, Vertex Coloring, Edge Coloring, Graph Isomorphism, Steiner Tree, Feedback Edge/Vertex Set)
  6. Computational Geometry - Robust Geometric Primitives, Convex Hull, Triangulation, Voronoi Diagrams, Nearest Neighbor Search, Range Search, Point Location, Intersection Detection, Bin Packing, Medial-Axis Transformation, Polygon Partitioning, Simplifying Polygons, Shape Similarity, Motion Planning, Maintaining Line Arrangements, Minkowski Sum
  7. Set and String Problems - Set Cover, Set Packing, String Matching, Approximate String Matching, Text Compression, Cryptography, Finite State Machine Minimization, Longest Common Substring, Shortest Common Superstring
  • NIST Dictionary of Algorithms and Data Structures - Some entries have links to implementations.
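
The simulated annealing entry above describes the acceptance rule; here is a minimal sketch of it on a toy 1-D problem (my illustration, not code from the linked page):

    import math, random

    def simulated_annealing(cost, neighbor, x, temp=1.0, cooling=0.995, steps=10000):
        """Always accept downhill moves; accept an uphill move with
        probability exp(-delta/temp), which shrinks as temp cools."""
        best = x
        for _ in range(steps):
            y = neighbor(x)
            delta = cost(y) - cost(x)
            if delta < 0 or random.random() < math.exp(-delta / temp):
                x = y
                if cost(x) < cost(best):
                    best = x
            temp *= cooling
        return best

    f = lambda x: x * x + 10 * math.sin(x)  # bumpy 1-D cost with local minima
    x_min = simulated_annealing(f, lambda x: x + random.uniform(-1, 1), x=10.0)
    print(round(x_min, 2))  # usually near the global minimum at x ≈ -1.31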



          How Future Technology Can Improve Human Brain Capabilities        

Kernel – a new company with an ambitious goal: to hack human intelligence (HI) and transform our brain into a programmable device that can be improved. By developing neuroprosthetics to bridge the gap between humankind and its tools, Kernel is already hard at work building the solutions to marry human and artificial intelligence. Kernel combines neuroscience with technological advancements in machine learning, microelectronics, and translational medicine, making it possible to enhance intelligence in ways we previously didn’t think possible, thus redefining the human brain. Neuroscience: advancements in neuroscience have given us a deeper understanding of the brain, and furnished us with tools to reengineer it. Machine learning: increasingly capable of matching, complementing and exceeding human abilities, it will make connecting HI and AI even more seamless. Microelectronics: miniaturized electronics are making brain-implant technology increasingly powerful and less invasive. Translational neuromedicine: advances in neuromedicine allow us to diagnose the mechanical causes of neurological ailments, and to construct new devices to successfully treat them. The symbiosis of human and artificial intelligence will unlock previously unimagined possibilities. Neuroprosthetics: sophisticated …



          Ensemble Models in Machine Learning        
Ensemble models have been around for a while. In essence, co-operative ensemble models take many independent predictive models and combine their results for greater accuracy and generalisation. Think "the wisdom of crowds" applied to predictive models. For example, the well-known Netflix prize, where competitors had to advance the existing RMSE scores on a movie recommendation [...]
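
To make the idea concrete, a minimal sketch with scikit-learn's VotingClassifier, which majority-votes three independent models (my illustration, not code from the post):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Three independently trained models vote; the majority class wins,
    # which typically generalises better than any single member.
    ensemble = VotingClassifier([('lr', LogisticRegression(max_iter=1000)),
                                 ('tree', DecisionTreeClassifier()),
                                 ('knn', KNeighborsClassifier())],
                                voting='hard')
    ensemble.fit(X_tr, y_tr)
    print(ensemble.score(X_te, y_te))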
          Size Matters: Empirical Evidence of the Importance of Training Set Size in Machine Learning        
There is much hype around "big data" these days - how it's going to change the world - which is causing data scientists to get excited about big data analytics, and technologists to scramble to understand how they employ scalable, distributed databases and compute clusters to store and process all this data. Interestingly, Gartner dropped "big [...]
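
The effect this post examines, accuracy as a function of training-set size, is easy to see with a learning curve; a hedged scikit-learn sketch (my illustration, not the post's experiment):

    import numpy as np
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import learning_curve

    X, y = load_digits(return_X_y=True)

    # Train on progressively larger slices of the data with cross-validation;
    # validation accuracy typically climbs with size, then flattens out.
    sizes, _, val_scores = learning_curve(
        LogisticRegression(max_iter=2000), X, y,
        train_sizes=np.linspace(0.1, 1.0, 5), cv=5)
    for n, s in zip(sizes, val_scores.mean(axis=1)):
        print(f"{n:5d} training examples -> {s:.3f} CV accuracy")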
          Imanis Data Launches Industry’s Fastest Backup and Recovery Software for Modern Data Platforms        
Leveraging the Power of Machine Learning, Imanis Data 3.0 Enables Early Detection of Ransomware Attacks and Data Loss, Minimizing Downtime SAN JOSE, CA – Aug. 09, 2017 — /BackupReview.info/ — Imanis Data, formerly Talena, creator of the first and only machine learning-backed data management software for modern data platforms, today announced the availability of the [...]

          Predicting the Past: Digital Art History, Modeling, and Machine Learning        
We are surrounded by models. When you check the weather forecast in the morning before going to work, you are seeing the result of a model of your local atmosphere. This model is a set of rules and da ... - Source: blogs.getty.edu
          Computer Reads Body Language        

Image of two robotics institute researchers showing how gestures are detected

Researchers at Carnegie Mellon University's Robotics Institute have enabled a computer to understand body poses and movements of multiple people from video in real time — including, for the first time, the pose of each individual's hands and fingers.

This new method was developed with the help of the Panoptic Studio — a two-story dome embedded with 500 video cameras — and the insights gained from experiments in that facility now make it possible to detect the pose of a group of people using a single camera and a laptop computer.

Yaser Sheikh, associate professor of robotics, said these methods for tracking 2-D human form and motion open up new ways for people and machines to interact with each other and for people to use machines to better understand the world around them. The ability to recognize hand poses, for instance, will make it possible for people to interact with computers in new and more natural ways, such as communicating with computers simply by pointing at things.

Detecting the nuances of nonverbal communication between individuals will allow robots to serve in social spaces, allowing robots to perceive what people around them are doing, what moods they are in and whether they can be interrupted. A self-driving car could get an early warning that a pedestrian is about to step into the street by monitoring body language. Enabling machines to understand human behavior also could enable new approaches to behavioral diagnosis and rehabilitation, for conditions such as autism, dyslexia and depression.

"We communicate almost as much with the movement of our bodies as we do with our voice," Sheikh said. "But computers are more or less blind to it."

In sports analytics, real-time pose detection will make it possible for computers to track not only the position of each player on the field of play, as is now the case, but to know what players are doing with their arms, legs and heads at each point in time. The methods can be used for live events or applied to existing videos.

To encourage more research and applications, the researchers have released their computer code for both multi-person and hand pose estimation. It is being widely used by research groups, and more than 20 commercial groups, including automotive companies, have expressed interest in licensing the technology, Sheikh said.

Sheikh and his colleagues will present reports on their multi-person and hand pose detection methods at CVPR 2017, the Computer Vision and Pattern Recognition Conference July 21-26 in Honolulu.

Tracking multiple people in real time, particularly in social situations where they may be in contact with each other, presents a number of challenges. Simply using programs that track the pose of an individual does not work well when applied to each individual in a group, particularly when that group gets large. Sheikh and his colleagues took a "bottom-up" approach, which first localizes all the body parts in a scene — arms, legs, faces, etc. — and then associates those parts with particular individuals.
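
As a toy illustration of that two-stage, bottom-up idea (and nothing like CMU's actual implementation, which learns the associations): detect all part candidates first, then greedily pair parts into individuals by proximity:

    import math

    # All part detections are found first (no per-person boxes), then
    # greedily associated into individuals by nearest-neighbor distance.
    # Positions below are made-up pixel coordinates.
    heads = [(100, 40), (300, 40), (205, 45)]
    necks = [(98, 80), (310, 85), (200, 90)]

    people, remaining = [], list(necks)
    for head in heads:
        neck = min(remaining, key=lambda n: math.dist(head, n))
        remaining.remove(neck)
        people.append({'head': head, 'neck': neck})

    for i, p in enumerate(people):
        print(f"person {i}: head={p['head']}, neck={p['neck']}")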

The challenges for hand detection are greater. As people use their hands to hold objects and make gestures, a camera is unlikely to see all parts of the hand at the same time. Unlike the face and body, large datasets do not exist of hand images that have been annotated with labels of parts and positions.

But for every image that shows only part of the hand, there often exists another image from a different angle with a full or complementary view of the hand, said Hanbyul Joo, a Ph.D. student in robotics. That's where the researchers were able to make use of CMU's multi-camera Panoptic Studio.

"A single shot gives you 500 views of a person's hand, plus it automatically annotates the hand position," Joo said. "Hands are too small to be annotated by most of our cameras, however, for this study we used just 31 high-definition cameras, but still were able to build a massive data set."

Joo and fellow Ph.D. student Tomas Simon used their hands to generate thousands of views.

"The Panoptic Studio supercharges our research," Sheikh said. It now is being used to improve body, face and hand detectors by jointly training them. Also, as work progresses to move from the 2-D models of humans to 3-D models, the facility's ability to automatically generate annotated images will be crucial, he said.

When the Panoptic Studio was built a decade ago with support from the National Science Foundation, it was not clear what impact it would have, Sheikh said.

"Now, we're able to break through a number of technical barriers primarily as a result of that NSF grant 10 years ago," he said. In addition to sharing the code, we're also sharing all the data captured in the Panoptic Studio."

In addition to Sheikh, the multi-person pose estimation research included Simon and master's degree students Zhe Cao and Shih-En Wei. The hand detection study included Sheikh, Joo, Simon and Iain Matthews, an adjunct faculty member in the Robotics Institute. Gines Hidalgo Martinez, a master's degree student, collaborates on this work, managing the source code.

The CMU AI initiative in the School of Computer Science is advancing artificial intelligence research and education by leveraging the school's strengths in computer vision, machine learning, robotics, natural language processing and human-computer interaction.


          Carnegie Mellon Solidifies Leadership Role in Artificial Intelligence        

Image of Rachel Holladay

Carnegie Mellon University's School of Computer Science (SCS) has launched a new initiative, CMU AI, that marshals work in artificial intelligence (AI) across the school's departments and disciplines, creating one of the largest and most experienced AI research groups in the world.

"For AI to reach greater levels of sophistication, experts in each aspect of AI, such as how computers understand the way people talk or how computers can learn and improve with experience, will increasingly need to work in close collaboration," said SCS Dean Andrew Moore. "CMU AI provides a framework for our ongoing AI research and education."

Image of the CMU AI logo

From self-driving cars to smart homes, AI is poised to change the way people live, work and learn.

"AI is no longer something that a lone genius invents in the garage," Moore added. "It requires a team of people, each of whom brings a special expertise or perspective. CMU researchers have always excelled at collaboration across disciplines, and CMU AI will enable all of us to work together in unprecedented ways."

CMU AI harnesses more than 100 faculty members involved in AI research and education across SCS's seven departments. Moore is directing the initiative with Jaime Carbonell, the Newell University Professor of Computer Science and director of the Language Technologies Institute; Martial Hebert, director of the Robotics Institute; Computer Science Professor Tuomas Sandholm; and Manuela Veloso, the Herbert A. Simon University Professor of Computer Science and head of the Machine Learning Department.

Carnegie Mellon has been on the forefront of AI since creating the first AI computer program, Logic Theorist, in 1956. It created the first and only machine learning department, studying how software can make discoveries and learn with experience. CMU scientists pioneered research into how machines can understand and translate human languages, and how computers and humans can interact with each other. Carnegie Mellon's Robotics Institute has been a leader in enabling machines to perceive, decide and act in the world, including a renowned computer vision group that explores how computers can understand images.

That expertise, spread across several departments, has enabled CMU to develop such technologies as self-driving cars; question-answering systems, including components of IBM's Jeopardy-playing Watson; world-champion robot soccer players; 3-D sports replay technology; and even an AI smart enough to beat four of the world's top poker players.

"AI is a broad field that involves extremely disparate disciplines, from optimization and symbolic reasoning to understanding physical systems," Hebert said. "It's difficult to have state-of-the art expertise in all of those aspects in one place. CMU AI delivers that and makes it centrally accessible."

Recent developments in computer hardware and software make it possible to reunite elements of AI that have grown independently and create powerful new AI technologies. These developments have created incredible demand from industry for computer scientists with AI know-how.

"Students who study AI at CMU have an opportunity to work on projects that unite multiple disciplines - to study AI in its depth and multidisciplinary, integrative aspects. They generally leave CMU for positions of great leadership, and they lead global AI efforts both in terms of starting new ventures and joining innovative companies that tremendously value our education and research," Veloso said. "CMU students at all levels have a big impact on what AI is doing for society."

Nearly 1,000 CMU students are involved in AI research and education. CMU also is vigorously engaged in outreach programs that introduce elementary and high school students to AI topics and encourage their skills in the field.

"We're teaching and engaging with those who will improve lives through technology, and who have taken responsibility for what happens in the rest of the century," Moore said. "Exposing these hugely talented human beings to the best AI resources and researchers is imperative for creating the technologies that will advance mankind. This is the first of many steps CMU will take to ensure AI is accessible to all."

CMU AI will focus on educating a new breed of AI scientist and on creating new AI capabilities, from smartphone assistants that learn about users by making friends with them to video technologies that can alter characters to appear older, younger or even as a different actor.

"CMU has a rich history of thought leadership in every aspect of artificial intelligence. Now is exactly the right time to bring this all together for an AI strategy to benefit the world," Moore said.


          CMU Delegation at World Economic Forum in China        

Image of the World Economic Forum sign in Dalian China

By Heidi Opdyke

Illah Nourbakhsh in 2015
Robotics Professor Illah Nourbakhsh leads a discussion on Asia’s Industrialization using visualizations created by his CREATE Lab from Landsat imagery in 2015 at the World Economic Forum's Annual Meeting of the New Champions.

Carnegie Mellon University researchers and scientists will play an important role in global discussions at the World Economic Forum's Annual Meeting of the New Champions, June 27-29, in Dalian, China.

Often called "Summer Davos," to differentiate it from the forum's annual winter meeting in Switzerland, the meeting brings together world leaders in business science, technology, innovation and politics. This year's theme is "Achieving Inclusive Growth in the Fourth Industrial Revolution."

CMU experts have since 2011 led conversations at the World Economic Forum in fields ranging from robotics to artificial intelligence. CMU scientists often lead discussions, give talks, demonstrate technology and provide their distinctive expertise.

This year's CMU delegation includes:

  • Erica Fuchs, professor of engineering and public policy;
  • Madeline Gannon, a research fellow with the Frank-Ratchye STUDIO for Creative Inquiry;
  • James McCann, assistant professor in the Robotics Institute;
  • Tom Mitchell, the E. Fredkin University Professor in the Machine Learning Department;
  • Illah Nourbakhsh, professor of robotics; and
  • Gabriel O'Donnell, principal research programmer and analyst in the Robotics Institute.

Image of the World Economic Sign in Dalian China

CMU will host a panel discussion called "The Future of Production with Carnegie Mellon University," in which Fuchs, Gannon and McCann will discuss rethinking behavior and purpose of industrial robots beyond factory floors, reimagining how large companies can integrate disruption themselves, and reconfiguring how automation collides with human skills.

Nourbakhsh and O'Donnell will make multiple presentations at the Global Situation Space exhibition. The presentations combine NASA time-lapse satellite imagery and geospatial and econometric data with predictive modelling to explore issues such as emerging megacities, man-made changes to the oceans and trade with China.

Nourbakhsh's CREATE Lab and its spinoff BirdBrain Technologies will be part of a workshop on building interactive sculptural robots. He will contribute to sessions on the fourth industrial revolution, the digital economy, the creative economy and platforms for artificial intelligence.

Mitchell will participate in a panel discussion about how the social safety net can respond to the fourth industrial revolution. He recently co-chaired a study of the future of work for the National Academies of Sciences, Engineering and Medicine. He will present a session on how big data can affect policymaking.

Madeline Gannon working with a robot
Madeline Gannon works with industrial robots and is working to invent better ways to communicate with machines.

Gannon was one of 20 researchers selected to the World Economic Forum's Cultural Leaders advisory community. As part of the programming, she will be participating in sessions that discuss the impact of human-centered robotics on the future of work.

Three Named Young Scientists

CMU faculty members Laura Dabbish, an associate professor in the Human-Computer Interaction Institute with a joint appointment in the Heinz College of Information Systems and Public Policy; Louis-Philippe Morency, an assistant professor in the Language Technologies Institute; and Tim Verstynen, an assistant professor of psychology, have been named 2017 Young Scientists by the World Economic Forum.

Fifty-two scientists under the age of 40 are recognized this year for exhibiting exceptional creativity, thought leadership and high growth potential, and will be at the Dalian conference.

CMU is one of only 27 universities in the world, 12 in the U.S., that make up the Global University Leaders Forum (GULF), which provides a unique platform for the world's top universities to discuss higher education and research while helping to shape the World Economic Forum agenda. GULF fosters discussion on global policy issues between member universities, the business community and a broad range of stakeholders.


          Carnegie Mellon's RoboTutor Advances to Global Learning XPRIZE Semifinals        

Image of a student working on a tablet

An estimated 250 million children around the world cannot read, write or do fundamental arithmetic, and many of these children are in developing countries without regular access to schools or teachers. XPRIZE is attempting to address this problem by funding an international competition to create open-source Android tablet apps that enable children between the ages of 7 and 10 to learn basic reading, writing and math skills without adult assistance. Apps were created in English and Swahili.

Nearly 200 teams from 40 countries entered the competition. Following an evaluation and pilot test, RoboTutor, led by CMU's Jack Mostow, is one of 11 remaining teams competing for five $1 million finalist prizes.

"RoboTutor is a brilliant piece of educational technology that has already proven to effectively teach English and Swahili-speaking children basic skills. It also perfectly exemplifies our evidence-based approach to carefully integrating technology and using data to continuously refine and improve instruction, leading to better student learning while supporting new discoveries in the learning sciences," said Richard Scheines, dean of the Dietrich College of Humanities and Social Sciences and faculty lead for CMU's Simon Initiative.

RoboTutor's design is based on scientific learning principles to engage students to learn the material and then use it in other contexts. It is powered by advanced technologies, including speech and handwriting recognition, facial analysis and machine learning. It collects data from its interactions with children both to enable cognitive tutors to adapt to individual students and to enable innovative data mining tools to continuously evaluate and refine its design and functionality.

Mostow, emeritus research professor of robotics, machine learning, language technologies and human-computer interaction, has spent the past three decades applying advanced language technologies to literacy.

"I have been able to help a few thousand children over my career, but the Global Learning XPRIZE is the opportunity of a lifetime for me to help millions or even billions of children get a basic education," Mostow said.

RoboTutor leverages many assets, including its precursor, Project LISTEN, which used speech technology to enable natural spoken dialogue with an automated Reading Tutor that listened to children read aloud and helped them learn to read.

RoboTutor's hundreds of activities address four content areas — reading and writing, numbers and math, comprehension and shapes.

Another distinguishing feature is how RoboTutor's data-driven design process integrates with local cultures.

"While we focus on improving RoboTutor with large-scale data, we never forget that it really represents kids who are living and learning in a context that we must understand, in order to properly interpret that data," said Amy Ogan, assistant professor of human-computer interaction, who field-tested RoboTutor in several settings in Tanzania.

In addition to Mostow and Ogan, the RoboTutor team consists of over 100 CMU students and faculty as well as other experts and students from around the globe.

Once the XPRIZE semifinalists have been evaluated, the top five will each receive $1 million, and XPRIZE will conduct an independent 18-month, large-scale study to field-test their Swahili apps, pre- and post-testing 4,000 children in 200 Tanzanian villages on literacy and numeracy. XPRIZE will award the $10 million grand prize to the team whose app achieves the highest learning gains.


          7 Techniques for Dimensionality Reduction        
Originally published at Data Mining / Machine Learning / Data Analysis:
In the current era of Big Data, in which the cost of storage has practically been driven down to commodity levels, many corporations that boast of being Big Data 'adopters' end up paying to store noise instead of signal. For the reason given above, from a Data Engineering perspective the…
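
The excerpt cuts off before the techniques themselves, but principal component analysis (PCA) is one of the standard methods such lists cover; a minimal sketch in Python (my illustration, not code from the post):

    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA

    X, _ = load_digits(return_X_y=True)   # 64-dimensional inputs

    # Keep just enough principal components to explain 90% of the variance
    pca = PCA(n_components=0.90).fit(X)
    X_reduced = pca.transform(X)
    print(X.shape[1], "->", X_reduced.shape[1], "dimensions")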
          Machine Learning Developer - Intel - Toronto, ON        
You will need both hardware and software engineering skills, the ability to manage complex projects, and to be part of a dynamic team working on state-of-the-art...
From Intel - Fri, 30 Jun 2017 10:25:27 GMT - View all Toronto, ON jobs
          Oracle Hospitality Introduces New Data Science Cloud Services to Help Food & Beverage Operators Optimize Every Sale        
Press Release

Oracle Hospitality Introduces New Data Science Cloud Services to Help Food & Beverage Operators Optimize Every Sale

Expert Analysis and Machine Learning Enable Improved Menu Optimization and Greater Cost and Inventory Control

Redwood Shores, Calif.—Aug 1, 2017


Empowering food and beverage operators to convert data into profit, Oracle Hospitality today announced Data Science Cloud Services. With the new services, food and beverage operators gain the ability to analyze key information such as sales, guest, marketing and staff performance data at unprecedented speed – generating insights that lead directly to actionable measures that improve their top and bottom lines.

The suite includes two cloud-driven offerings – Oracle Hospitality Menu Recommendations Cloud Service and Oracle Hospitality Adaptive Forecasts Cloud Service – currently available to operators worldwide, enabling them to improve up-sell and cross-sell opportunities, and optimize operations, respectively.

The new Data Science Cloud Services bring Oracle’s renowned machine learning and data-analytics expertise specifically to the food and beverage industry. This, combined with years of hospitality industry knowledge, delivers quick wins for operators, while saving them the significant expense of having to hire their own analysts and invest in a data processing infrastructure. In addition to Oracle technology, Data Science delivers the support of a team of leading data scientists, database engineers and experienced hospitality consultants. 

“Margins are being squeezed in hospitality like never before,” said Mike Webster, Senior Vice President and General Manager, Oracle Hospitality. “Labor and food costs are increasing, and competition for the dining dollar is high. With our Data Science Cloud Services, we are giving our customers the ability to be as profitable as possible, by helping them pinpoint cost-savings in each location while optimizing every single sales opportunity to deliver revenue growth.”

Making Every Sale Count with Oracle Hospitality Menu Recommendations Cloud Service

Oracle Hospitality Menu Recommendations Cloud Service allows food and beverage operators with multiple locations to evaluate their menus and identify enhancements to maximize every sales opportunity. The Data Science service can identify the best possible up-sell or cross-sell options by location or time of day, with recommendations dynamically updating based on customer behavior. Assumptions around cross-sells and up-sells can be analyzed, leading to a better understanding of guest behavior and preferences.

Speed to value is accelerated, thanks to integration between the Data Science service and the Oracle Hospitality technology platform. Recommendations are available at point-of-service terminals and displayed as localized cross-sells or timed up-sells. Such simplicity enables staff to optimize sales and serve guests without delay or confusion.
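
Oracle has not published the service's internals, but the underlying idea of recommending the items most often bought alongside the current order can be sketched with simple co-occurrence counting (illustrative data and names only):

    from collections import Counter
    from itertools import combinations

    # Toy order log; in practice this would come from point-of-sale data
    orders = [{'burger', 'fries'}, {'burger', 'fries', 'cola'},
              {'salad', 'water'}, {'burger', 'cola'}, {'fries', 'cola'}]

    co = Counter()
    for order in orders:
        for a, b in combinations(sorted(order), 2):
            co[(a, b)] += 1
            co[(b, a)] += 1

    def cross_sell(item, k=2):
        """Rank candidate add-ons by how often they co-occur with item."""
        scores = Counter({b: n for (a, b), n in co.items() if a == item})
        return [b for b, _ in scores.most_common(k)]

    print(cross_sell('burger'))  # e.g. ['fries', 'cola']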

Predicting Stock and Labor Needs with Oracle Hospitality Adaptive Forecasts Cloud Service

Oracle Hospitality Adaptive Forecasts lets operators better predict stock and labor needs at every location. The service creates a single forecast by item, location and day part, and factors in weather, events, time of day, day of the week and Net Promoter scores. Such forecasting maintains appropriate levels of inventory and staffing in all business scenarios, helping store managers minimize wasted inventory, lower labor costs and, most importantly, ensure an exceptional guest experience.
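
Again, the service's actual models are not public; the kind of per-item, per-location, per-daypart forecast described here can be sketched as a regression on calendar and weather features (all names and numbers below are illustrative assumptions):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    # Hypothetical history for one item at one location:
    # features are [day_of_week, is_lunch_daypart, temperature_C]
    X = np.array([[0, 1, 28], [1, 1, 30], [2, 0, 26],
                  [3, 1, 24], [4, 0, 31], [5, 1, 33]])
    y = np.array([120, 135, 60, 110, 75, 150])  # units sold

    model = LinearRegression().fit(X, y)

    # Forecast Saturday lunch at 32 degrees for this item/location
    print(round(model.predict(np.array([[5, 1, 32]]))[0]))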

For Oracle Hospitality customers, these Data Science Cloud Services complement the self-service data access and reporting solutions that are already available, including the InMotion mobile app that provides real-time access to restaurant KPIs and the Reporting and Analytics 9.0 service that was launched in April 2017.


About Oracle Hospitality

Oracle Hospitality brings 35 years of experience in providing technology solutions to food and beverage operators. We provide hardware, software, and services that allow our customers to deliver exceptional guest experiences while maximizing profitability. Our solutions include integrated point-of-sale, loyalty, reporting and analytics, inventory and labor management, all delivered from the cloud to lower IT cost and maximize business agility.

For more information about Oracle Hospitality, please visit www.Oracle.com/Hospitality

About Oracle

The Oracle Cloud offers complete SaaS application suites for ERP, HCM and CX, plus best-in-class database Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) from data centers throughout the Americas, Europe and Asia. For more information about Oracle (NYSE:ORCL), please visit us at www.oracle.com.

Trademarks

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Disclaimer

The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.



          Oracle Named a Leader in 2017 Analyst Evaluation for Digital Process Automation Software        
Press Release

Oracle Named a Leader in 2017 Analyst Evaluation for Digital Process Automation Software

Oracle positioned as a leader and selected for evaluation based on proven customer adoption, strong go-to-market strategy, and breadth of capabilities

Redwood Shores, Calif.—Jul 18, 2017


Oracle today announced it has been named a leader in the Forrester Wave™ Digital Process Automation (DPA) Software report. This placement confirms leadership for Oracle Process Cloud, which is part of Oracle Cloud Platform.

“By delivering comprehensive process automation capabilities such as no- and low-code process design, case management and simplified connections to SaaS, Social, Cloud and on-premise systems, Oracle provides customers with a powerful option to continuously deliver engaging customer, employee, and partner experiences at every stage in their business transformation journey,” said Vikas Anand, vice president, product management, Oracle. “Today business process automation augmented with intelligent machine learning is helping organizations drive best next actions and provide them with better, timely decision making capabilities.”

In its 30-criteria evaluation of DPA vendors, Forrester assessed 12 significant software providers. Oracle was cited as a leader, with the highest possible scores in the low-code/no-code, smart forms and user experience, process flow and design, mobile engagement, API support, data virtualization, deployment options, and ease of implementation criteria.

Download Forrester’s Wave report for “Digital Process Automation Software, Q3 2017” here.

Oracle Process Cloud Service was built from the ground up for the cloud to provide enterprises of all sizes with the low-code app dev platform that they need to build business agility and control their digital destiny. With full lifecycle support for end-to-end process automation spanning departments, SaaS apps, and on-premises systems of record, Oracle Process Cloud Service empowers business analysts and process designers with the tools they need to rapidly deliver differentiating experiences in a collaborative manner. Oracle Process Cloud Service comes with Quick Start App templates and pre-built integrations to companion platform services including, Content Management, Integration, Mobile, Intelligent Bots, and IoT Apps, to enable rapid delivery of engaging experiences across channels and devices.

Customer Momentum

“Process automation is central to our integration strategy,” said Ravi Gade, senior director of apps IT and digital transformation, Calix. “Calix leverages Oracle Process Cloud to reduce IT backlog, ensure compliance, and simplify connections across our rapidly evolving SaaS and on-premises business systems.”

“Using a combination of Oracle Process Cloud Service and Oracle Application Builder Cloud Service along with the cloud-native best practices introduced by our partner, Rubicon Red, we have a comprehensive, integrated cloud platform that enables us to deliver innovative, modern solutions,” said Ryan Klose, general manager, corporate, National Pharmacies. “The Oracle Cloud Platform gives us flexibility to connect to all our core systems, and easily deliver to a range of user interfaces, whether they be online, mobile/tablet, devices/IoT, or emerging chatbot technology.” 

"Oracle Process Cloud Service has allowed us to dramatically shorten our time-to-market by up to 40 percent,” said Matt Wright, chief technology officer, Rubicon Red. “Oracle provides developers with immediate access to a full lifecycle process management environment—including development, test, and production—and enterprise-quality tooling, without needing to build and maintain an IT infrastructure.”



Contact Info
Nicole Maloney
Oracle
+1.415.235.4033
nicole.maloney@oracle.com
Sarah Fraser
Oracle
+1.650.743.0660
sarah.fraser@oracle.com
About Oracle

The Oracle Cloud offers complete SaaS application suites for ERP, HCM and CX, plus best-in-class database Platform as a Service (PaaS) and Infrastructure as a Service (IaaS) from data centers throughout the Americas, Europe and Asia. For more information about Oracle (NYSE:ORCL), please visit us at www.oracle.com.

Trademarks

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.



          Futurology – The Future of Machine Learning        
This week the Futurology team take on the future of machine learning - what does it all mean? Listen to the discussion around the ins and outs of Machine Learning, AI and the impact of all of this progress on jobs, innovation and the future.
          Houston Legal Links 8/9/2017        
Top legal news: Texas House passes bill restricting insurance coverage of abortion; Disciplinary Suit Filed Against Lawyer for Filing Frivolous Legal Mal Case (Texas Lawyer); Religious Groups Run Ad Blitzes Both For And Against “Bathroom Bill”; Fifth Circuit Allows Employer to Force Workers to Sign Class Action Waivers (Texas Lawyer); To Justify Defunding Planned Parenthood, AG Paxton Still Thinks "Sting" Videos Are Legit; Texas Tells 5th Circ. Planned Parenthood Medicaid Cutoff OK (Law360); Commissioners weigh deputy pay raises as Harris County pushes back on salary complaints; Are lawmakers' business ties with public entities a conflict of interest?; Texas restaurant group joins battle against bathroom bill; Houston May Put A $495 Million Bond On The Ballot — So What's In It?; Texas backs Wisconsin in battle to protect partisan gerrymandering; UT, A&M students latest to push against assault inquiry process (Chron subsc); Dallas Bar Judicial Evaluation Poll: No ‘A’s’ but Nine ‘F’s" (Texas Lawbook); Texas Sen. Fights Atty DQ In Frack Sand Fraud Suit (Law360); Two slain men possibly lured by dating app, investigators say (Chron subsc); Houston family says its 5-year-old dog died while in the cargo area of United Airlines flight; Brazoria Commissioners ban swimming in San Luis Pass & Case Against Jockey Who Buzzed Horse At Sam Houston Race Park Still Unresolved. For the water cooler: Biglaw Is Handing Out Fidget Spinners Swag And We’re All Doomed; Law firms pay lip service to diversity, must take ‘concrete steps to change,’ former judge says; Report finds greater flow of ‘big-ticket’ work to the nation’s top law firms; Alleged Lamppost Thief Has Trouble With Getaway; Which Biglaw Firm Is Best For Pro Bono? (2017); Forever 21 Claps Back At Gucci Over Cease-And-Desists Involving Stripes [Updated]; Justice Department changes position, supports Ohio in voter purge case; Axiom adds machine learning to contract review; Trump sends appreciative messages to Mueller; firing him has ‘never been on the table,’ lawyer says; What ‘Game Of Thrones’ Can Teach Lawyers About Trial; ABA, senators ask CMS to rethink mandatory arbitration in nursing home admissions contracts; Federal age bias law is rarely the basis for EEOC lawsuits; Individuals forced to pay hundreds of dollars more to pretrial services firm after bail: lawsuit; Should Congress regulate sexbots? Law prof sees need for hearings & Posner opinion tosses death sentence for man forced to wear stun belt during penalty hearing.
          CDS Partners with Brainspace to Enhance Advanced Analytics Portfolio        

eDiscovery Leader Strengthens Analytics Toolkit with Visual Machine Learning for Investigations, Early Case Assessment, and Technology Assisted Review

(PRWeb August 10, 2017)

Read the full story at http://www.prweb.com/releases/2017/08/prweb14589151.htm


          Data Integration is the Foundation        

Unless you live under a rock, you’ve seen the buzz about Data Lakes, Big Data, Data Mining, Cloud-tech, and Machine Learning. I watch and read the reports from two perspectives: as an engineer and as a consultant.

As a Consultant

If you watch CNBC, you won’t hear discussions about ETL Incremental Load or Slowly Changing Dimensions Design Patterns. You will hear them using words like “cloud” and “big data,” though. That means people who watch and respect the people on CNBC are going to hire consultants who are knowledgeable about cloud technology and Big Data.

As an Engineer

I started working with computers in 1975. Since that time, I believe I’ve witnessed about one major paradigm shift per decade. I believe I am now witnessing two at the same time: 1) a revolution in Machine Learning and all the things it touches (which includes Big Data and Data Lakes); and 2) the Cloud. These two are combining in some very interesting ways. Data Lakes and Big Data appliances and systems are the sources for many systems; Machine Learning and Data Mining solutions are but a couple of their consumers. At the same time, much of this technology and storage is either migrating to the Cloud, or is being built there (and in some cases, only there). But all of this awesome technology depends on something…

Data

In order for Machine Learning or Data Mining to work, there has to be data in the Data Lake or in the Big Data appliance or system. Without data, the Data Lake is dry. Without data, there’s no “Big” in Big Data. How do these solutions acquire data?

It Depends

Some of these new systems have access to data locally. But many of them – most, if I may be so bold – require data to be rounded up from myriad sources. Hence my claim that data integration is the foundation for these new solutions.

What is Data Integration and Why is it Important?

Data integration is the collection of data from myriad, disparate sources into a single (or minimal number of) repository (repositories). It’s “shipping” the data from where it is to someplace “nearer.” Why is this important? Internet connection speeds are awesome these days. I have – literally – 20,000 times more bandwidth than when I first connected to the internet. But modern internet connection speeds are hundreds to millions of times slower than networks running inside data centers. Computing power – measured in cycles or flops per second – is certainly required to perform today’s magic with Machine Learning. But if the servers must wait hours (or longer) for data – instead of milliseconds? The magic happens in slow-motion. In slow-motion, magic doesn’t look awesome at all.

Trust me, speed matters.
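
A back-of-the-envelope calculation makes the point; the link speeds below are illustrative assumptions, not figures from the post:

    TB = 8 * 10**12  # one terabyte expressed in bits

    for name, bps in [("100 Mbps internet link", 100e6),
                      ("100 Gbps data-center fabric", 100e9)]:
        hours = TB / bps / 3600
        print(f"1 TB over a {name}: {hours:.2f} hours")

    # 1 TB over a 100 Mbps internet link: 22.22 hours
    # 1 TB over a 100 Gbps data-center fabric: 0.02 hours (about 80 seconds)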

Data integration is the foundation on which most of these systems depend. Some important questions to consider:

  • Are you getting the most out of your enterprise data integration?
  • Could your enterprise benefit from faster access to data – perhaps even near real-time business intelligence?
  • How can you improve your enterprise data integration solutions?

:{>

Learn more:

Enterprise Data & Analytics
Stairway to Integration Services
IESSIS1: Immersion Event on Learning SQL Server Integration Services
EnterpriseDNA Training


          Get ready for the post-cloud world        
Just when cloud computing seems inevitable as the dominant force in IT, it’s time to move on because we’re not quite at the end-state of digital transformation. Far from it.

Now's the time to prepare for the post-cloud world.

It’s not that cloud computing is going away. It’s that we need to be ready to make the best of IT productivity once cloud in its many forms becomes so pervasive as to be mundane, the place where all great IT innovations must go.






          India Smart Cities Mission shows IoT potential for improving quality of life at vast scale        
The next BriefingsDirect Voice of the Customer Internet-of-Things (IoT) transformation discussion examines the potential impact and improvement of low-power edge computing benefits on rapidly modernizing cities.

These so-called smart city initiatives are exploiting open, wide area networking (WAN) technologies to make urban life richer in services, safer, and far more responsive to residents’ needs. We will now learn how such pervasively connected and data-driven IoT architectures are helping cities in India vastly improve the quality of life there.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

Here to share how communication service providers have become agents of digital urban transformation are VS Shridhar, Senior Vice President and Head of the Internet-of-Things Business Unit at Tata Communications in the Chennai area of India, and Nigel Upton, General Manager of the Universal IoT Platform and Global Connectivity Platform and Communications Solutions Business at Hewlett Packard Enterprise (HPE). The discussion is moderated by Dana Gardner, principal analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us about India’s Smart Cities mission. What are you up to and how are these new technologies coming to bear on improving urban quality of life?

Shridhar: The government is clearly focusing on Smart Cities as part of their urbanization plan, as they believe Smart Cities will not only improve the quality of living, but also generate employment, and take the whole country forward in terms of technologically embracing and improving the quality of life.

So with that in mind, the Government of India has launched its 100 Smart Cities initiative. It’s quite interesting because each of the cities that aspired to be included had to make a plan and its own strategy around how it is going to evolve and how it is going to execute, present it, and get selected. There was a proper selection process.

Many of the cities made it, and of course some of them didn’t make it. Interestingly, some of the cities that didn’t make it are developing their own plans.
There is a lot of excitement and curiosity, as well as action, in the Smart Cities project. Admittedly, it’s a slow process; it’s not something that you can do in the blink of an eye, and Rome wasn’t built overnight, but I definitely see a lot of progress.

Gardner: Nigel, it seems that the timing for this is auspicious, given that there are some foundational technologies that are now available at very low cost compared to the past, and that have much more of a pervasive opportunity to gather information and make a two-way street, if you will, between the edge and central administration. How is the technology evolution synching up with these Smart Cities initiatives in India?

Upton: I am not sure whether it’s timing or luck, or whatever it happens to be, but adoption of the digitization of city infrastructure and services is to some extent driven by economics. While I like to tease my colleagues in India about their sensitivity to price, the truth of the matter is that the economics of digitization -- and therefore IoT in smart cities -- needs to be at the right price, depending on where it is in the world, and India has some very specific price points to hit. That will drive the rate of adoption.

And so, we're very encouraged that innovation is continuing to drive price points down to the point that mass adoption can then be taken up, and the benefits realized by a much broader spectrum of the population. Working with Tata Communications has really helped HPE understand this and continue to evolve the technology and be part of the partner ecosystem, because it does take a village to raise an IoT smart city. You need a lot of partners to make this happen, and that combination of partnership, willingness to work together and driving the economic price points to the point of adoption has been absolutely critical in getting us to where we are today.

Balanced Bandwidth

Gardner: Shridhar, we have some very important optimization opportunities around things like street lighting, waste removal, public safety, water quality and, of course, the pervasive need for traffic and parking monitoring and improvement.

How do things like low-power network specifications, Internet gateways and low-power WANs (LPWANs) create a new technical foundation to improve these services? How do we connect the services and the technology for an improved outcome?

Shridhar: If you look at human interaction with the Internet, we have a lot of technology coming our way. We used to have 2G, which moved to 3G and then 4G, and that is a lot of bandwidth coming our way. We would like to have a tremendous amount of access and bandwidth speeds and so on, right?

Shridhar
So the human interaction and experience is improving vastly, given the networks that are growing. On the machine-to-machine (M2M) side, it’s going to be different. Machines don’t need oodles of bandwidth. About 80 to 90 percent of all machine interactions are going to be very, very low bandwidth – and, of course, low power. I will come to the low power in a moment, but it’s going to be a very low bandwidth requirement.

In order to switch off a streetlight, how much bandwidth do you actually require? Or, in order to sense temperature or air quality or water and water quality, how much bandwidth do you actually require?

When you ask these questions, you get an answer that the machines don’t require that much bandwidth. More importantly, when there are millions -- or possibly billions -- of devices to be deployed in the years to come, how are you going to service a piece of equipment that is telling a streetlight to switch on and switch off if the battery runs out?

Machines are different from humans in terms of interactions. When we deploy machines that require low bandwidth and low power consumption, a battery can enable such a machine to communicate for years.

Aside from heavy video streaming applications or constant security monitoring, where low-bandwidth, low-power technology doesn’t work, the majority of the cases are all about low bandwidth and low power. And these machines can communicate with the quality of service that is required.

When it communicates, the network has to be available. You then need to establish a network that is highly available, which consumes very little power and provides the right amount of bandwidth. So studies show that less than 50 kbps connectivity should suffice for the majority of these requirements.
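
Rough numbers show why low power matters as much as low bandwidth; a back-of-the-envelope sketch for a LoRa-class sensor (every figure below is an illustrative assumption, not a Tata specification):

    battery_mAh = 2000          # AA-class lithium cell
    sleep_mA = 0.005            # deep-sleep current draw
    tx_mA, tx_seconds = 40, 1   # transmit current and airtime per uplink
    messages_per_day = 24       # one reading per hour

    tx_mAh_per_day = tx_mA * tx_seconds * messages_per_day / 3600
    sleep_mAh_per_day = sleep_mA * 24
    years = battery_mAh / (tx_mAh_per_day + sleep_mAh_per_day) / 365
    print(f"Estimated battery life: {years:.1f} years")  # roughly 14 years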

Now the machine interaction also means that you collect all of it into a platform and act on it. It’s not just about sensing; it’s measuring it, analyzing it, and acting on it.

Low-power to the people

So the whole stack consists of more than connectivity alone. LPWAN technology is emerging now and becoming a de facto standard as more and more countries start embracing it.

At Tata Communications we have embraced the LPWAN technology from the LoRa Alliance, a consortium of more than 400 partners who have gotten together and are driving standards. We are creating this network over the next 18 to 24 months across India. We have made these networks available right now in four cities. By the end of the year, it will be many more cities -- almost 60 cities across India by March 2018.

Gardner: Nigel, how do you see the opportunity, the market, for a standard architecture around this sort of low-power, low-bandwidth network? This is a proof of concept in India, but what's the potential here for taking this even further? Is this something that has global potential?
Upton: The global potential is undoubtedly there, and there is an additional element that we didn't talk about, which is that not all devices require the same amount of bandwidth. We have talked about video surveillance requiring higher bandwidth, and we have talked about low-power, low-bandwidth devices that will essentially be deployed once and forgotten, expected to last 5 or 10 years.

Upton
We also need to add in the aspect of security, and that really gave HPE and Tata the common ground of understanding that the world is made up of a variety of network requirements, some of which will be met by LPWAN, some of which will require more bandwidth, maybe as high as 5G.

The real advantage of using a common architecture to take the data from these devices is having things like common management, common security, and a common data model, so that you really have the power of taking data from all of these different types of devices and pulling it into a common platform that is based on a standard.

In our case, we selected the oneM2M standard; it’s the best standard available for building that common data model, and that’s the reason we deployed the oneM2M model within the Universal IoT Platform: to get that consistency no matter what type of device, over no matter what type of network.

Gardner: It certainly sounds like this is an unprecedented opportunity to gather insight and analysis into areas that you just really couldn't have measured before. So going back to the economics of this, Shridhar, have you had any opportunity through these pilot projects in such cities as Jamshedpur to demonstrate a return on investment, perhaps on street lighting, perhaps on quality of utilization and efficiency? Is there a strong financial incentive to do this once the initial hurdle of upfront costs is met?

Data-driven cost reduction lights up India

Shridhar: Unless the customer sees that there is scope for either reducing the cost or improving the customer experience, they are not going to buy these kinds of solutions. So if you look at how things have been progressing, I will give you a few examples of how the costs have started coming down and playing out. One, of course, is having devices that meet a certain price point. Nigel was remarking how cost-conscious the Indian market is, but it's important: once we deliver to a certain cost here, we believe we can deliver globally at scale. That's very important, so if we build something in India it can be delivered to the global market as well.

The streetlight example, let’s take that specifically and see what kind of benefits it would give. When a streetlight operates for about 12 hours a day, it costs about Rs.12, which is about $0.15, but when you start optimizing it and say, okay, this is a streetlight that is supported currently on halogen and you move it to LED, it brings a little bit of cost saving, in some cases significant as well. India is going through an LED revolution as you may have read in the newspapers, those streetlights are being converted, and that’s one distinct cost advantage.

Now they are looking at driving, let’s say, the usage and the electricity bills even lower by optimizing it. Let’s say you sync it with the astronomical clock, so that at 6:30 in the evening it comes on and at 6:30 in the morning it shuts down, because now you are connecting this controller to the Internet.

The second thing you would do is keep it at the brightest during busy hours, let’s say between 7:00 and 10:00, and after that start dimming it. You can step it down in 10 percent increments.

The point I am making is, you basically deliver the intensity of light to the kind of requirement that you have. If it is busy, or if there is nobody on the street, or if there is a safety requirement -- a sensor will trigger a series of lights, and so on.

So your ability to tailor the light delivered to the actual requirement is so great that it brings down the total cost. While I was telling you about the $0.15 that you would spend per streetlight, that could be brought down to $0.05. So that’s the kind of advantage you get by better controlling the streetlights. The business case builds up, and a customer can save 60 to 70 percent just by doing this. Obviously, then the business case stands out.
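
The arithmetic behind those figures can be sketched directly; the tariff and wattages below are illustrative assumptions, not numbers from Tata's deployments, but they land close to the $0.15 and $0.05 quoted above:

    tariff = 0.125             # $ per kWh (illustrative)
    halogen_w, led_w = 100, 50

    # Baseline: halogen at full brightness for 12 hours a night
    baseline = halogen_w / 1000 * 12 * tariff

    # LED with a dimming schedule: 3 h at 100%, then 9 h at 40%
    dimmed = led_w / 1000 * (3 * 1.0 + 9 * 0.4) * tariff

    print(f"per light, per night: ${baseline:.3f} -> ${dimmed:.3f} "
          f"({1 - dimmed / baseline:.0%} saving)")
    # per light, per night: $0.150 -> $0.041 (73% saving)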

The question that you are asking is an interesting one, because each of the applications has its own way of returning the investment while resources are being optimized. There is also a collateral positive benefit in saving the environment. So not only do I gain business savings and optimization, but I also pass on the bigger message of a green environment. Environment and safety are the two biggest benefits of implementing this, and they really appeal to our customers.

Gardner: It’s always great to put hard economic metrics on these things, but Shridhar just mentioned safety. Even when you can't measure in direct economics, it's invaluable when you can bring a higher degree of safety to an urban environment.

It opens up for more foot traffic, which can lead to greater economic development, which can then provide more tax revenue. It seems to me that there is a multiplier effect when you have this sort of intelligent urban landscape that creates a cascading set of benefits: the more data, the more efficiency; the more efficiency, the more economic development; the more revenue, the more data and so on. So tell us a little bit about this ongoing multiplier and virtuous adoption benefit when you go to intelligent urban environments?

Quality of life, under control

Upton: Yes, and it’s important to note that it differs almost country to country, and region to region within countries. The interesting challenge with smart cities is that often you're dealing with elected officials rather than hard-nosed businessmen who are only interested in the financial return. And because you're dealing with politicians, who represent the citizens in their city, town or region, their priorities are not always the same.

There is quite a variation in the particular social challenges, as well as the quality-of-life challenges, in each of the areas they work in. So things like personal safety are a very big deal in some regions. I am currently in Tokyo, and here there is much more concern around quality of life and mobility with a rapidly aging population, so their challenges are somewhat different.
But in India, the set of opportunities and challenges is a combination of economic and social. If you solve them -- if you essentially give citizens more peace of mind, more ability to move freely and to take part in the economic interactions of their area -- then undoubtedly that leads to greater growth. But it is worth bearing in mind that it varies almost city by city and region by region.

Gardner: Shridhar, do you have any other input on the cascading set of benefits you get from more data and more network opportunity? I am trying to understand the longer-term objective: being intelligent and data-driven has an ongoing set of benefits -- what might those be? How can this be a long-term data and analytics treasure trove when it comes to providing better urban experiences?

Home/work help

Shridhar: From our perspective, when we look at customer benefits, there is a huge amount of focus on smart cities and how they benefit from a network. If you look at enterprise customers, they are also looking at safety, which is an application that overlaps with what a smart city would have.

So the enterprise wants to provide safety to its workers -- for example, in mines or in difficult terrains and environments. Or women's safety, which as you know is a big concern in India -- how do you provide a device that is not very obvious and yet gives women the safety they need?

All of this, in some form, is providing data. One of the things that comes to my mind, when you ask how data-driven these resources can be and what kind of quality that would give, is customer service devices. There could be applications where, let's say, a housewife has a multi-button device from which she can order a service.

Depending on the service she presses, aggregated across households in India, you would know the trends and direction for a certain service. And mind you, it could be as simple as a three-button device that says Service A, Service B, Service C -- a consumer service extended to a particular household that we sell as a service.

So you could get lots of trends and patterns emerging from that, and we believe the customer experience is going to change, because the customer no longer has to retain in mind which phone numbers to call or which apps to order from; you give them the convenience of a button-press service. That immediately comes to my mind.
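
As an illustration only -- the device, field names, and services below are hypothetical, not an actual product schema -- a few lines of Python show how such button presses could be aggregated into service-demand trends:

```python
from collections import Counter
from typing import NamedTuple

class ButtonPress(NamedTuple):
    """One press event from a hypothetical three-button service device."""
    household_id: str
    region: str
    service: str  # "A", "B", or "C"

def service_trends(events: list[ButtonPress]) -> dict[str, Counter]:
    """Aggregate presses into per-region demand counts for each service."""
    trends: dict[str, Counter] = {}
    for event in events:
        trends.setdefault(event.region, Counter())[event.service] += 1
    return trends

# Example: two Mumbai households order Service A; one Delhi household, Service C.
events = [
    ButtonPress("h-001", "Mumbai", "A"),
    ButtonPress("h-002", "Mumbai", "A"),
    ButtonPress("h-003", "Delhi", "C"),
]
print(service_trends(events))
# {'Mumbai': Counter({'A': 2}), 'Delhi': Counter({'C': 1})}
```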

Feedback fosters change

The second one is in terms of feedback. You use the same three-button device to rate how well a utility has served you -- what quality of service you would assign to the multiple utilities you are using. And there is a toilet revolution in India, for example; you put these buttons out there, and they will tell you at any given point in time what the user satisfaction is, and so on.

So all of this data is getting gathered. It is early days for us to put out analytics and point to distinct benefits, but some of the things customers are already looking at are which geographies, which segments, and what profile of customers are using this, and so on. That kind of information is going to come out very, very distinctly.

Smart cities are all about experience. Enterprises are now looking at the data that is coming out and seeing how they can use it to better segment and provide a better customer experience, which would add to their top line as well as help them manage their bottom line. So it goes beyond safety; it gets into the realm of managing the customer experience.

Gardner: From a go-to-market perspective -- or a go-to-cities perspective -- these are very complex undertakings, with lots of moving parts and lots of different technologies and standards. How are Tata and HPE coming together -- along with other service providers, Pointnext, for example? How do you put this into a package that can actually be managed and put in place? How do we make this appealing not only in terms of its potential, but actionable as well across different cities and regions?

Upton: The concept of Smart Cities has been around for a while and various governments around the world have pumped money into their cities over an extended period of time.
We now have the infrastructure in place, we have the price points and we have IoT becoming mainstream.

As usual, these things always take more time than you think, and I do not believe that today we have a technology challenge on our hands. We have much more of a business-model challenge. Being able to deploy technology to bring benefits to citizens is, I think, finally getting to the point where it is much better understood. There has been very rapid innovation at the device level -- whether it's streetlights or the ability to measure water quality, sound quality, humidity, all of these metrics that we have available to us now -- and in the economics of how to produce those devices at a price that will enable widespread deployment.

All of that has been happening rapidly over the last few years, getting us to the point where we now have the infrastructure in place, we have the price points in place, and we have IoT becoming mainstream enough that it is entering into the manufacturing process of all sorts of different devices -- as I said, ranging from streetlights to personal security devices through to track-and-trace devices that are built into goods as they are manufactured.
That is now reaching the mainstream, and we are able to take advantage of the massive data being produced to create even more efficient and smarter cities, and to make them safer places for our citizens.

Gardner: Last word to you, Shridhar. If people want to learn more about the pilot proofs of concept (PoCs) that you are doing at Jamshedpur and other cities through the Smart Cities Mission, where might they go? Are there any resources? How would you provide more information to those interested in pursuing these technologies?

Pilot projects take flight

Shridhar: I would be very happy to help them look at the PoCs that we are doing. I would classify the PoCs into buckets: safety is one; energy management, which we talked about, is another big one; then the customer service I spoke about; and the fourth, I would say, is more on the utility side. Gas and water are two big applications where customers are looking at these PoCs very seriously.

And there is one very interesting application: a customer wanted it for pest control, where he wanted his mouse traps to have sensors so that he would know at any point in time whether anything had been trapped, which I thought was very interesting.
There are multiple streams that we have; we have done multiple PoCs, and we, as the Tata Communications team, will be very happy [to provide more information], and the HPE folks are in touch with us.

You could write to us -- to me in particular -- for some period of time. We are also putting information on our website, and we have marketing collateral that describes this. We will do some joint workshops with HPE as well.

So there are multiple ways to reach us, and one of the best is through our website. We are always there to provide more help, and we believe that we can't do it all alone; it's about the ecosystem getting to know this and getting to work on it.

While we have partners like HPE at the platform level, we also have partners such as Semtech, which established a Center of Excellence in Mumbai along with us. So access to the ecosystem, from the HPE side as well as our other partners, is available, and we are happy to work together and co-create solutions going forward.


How confluence of cloud, UC and data-driven insights newly empowers contact center agents
The next BriefingsDirect customer experience insights discussion explores how contact center-as-a-service (CCaaS) capabilities are becoming more powerful as a result of leveraging cloud computing, multi-mode communications channels, and the ability to provide optimized and contextual user experiences.

More than ever, businesses have to make difficult and complex decisions about how to best source their customer-facing services. Which apps and services, and what data and resources, should be in the cloud or on-premises -- or in some combination -- are among the most consequential choices business leaders now face. As the confluence of cloud and unified communications (UC) -- along with data-driven analytics -- gains traction, the contact center function stands out.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. 

We’ll now hear why traditional contact center technology has become outdated, inflexible and cumbersome, and why CCaaS is becoming more popular in meeting the heightened user experience requirements of today.
Here to share more on the next chapter of contact center and customer service enhancements is Vasili Triant, CEO of Serenova in Austin, Texas. The discussion is moderated by Dana Gardner, principal analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: What are the new trends reshaping the contact center function?

Triant: What's changed in the world of contact center and customer service is that we're seeing a generational spread -- everything from baby boomers all the way now to Gen Z.

With the proliferation of smartphones through the early 2000s, and new technologies and new channels -- things like WeChat and Viber -- all these customers now represent potential inbound discussions with brands. And they all have different mediums they want to communicate on. It's no longer just phone or e-mail: it's phone, e-mail, web chat, SMS, WeChat, Facebook, Twitter, LinkedIn, and other channels coming around the corner that we don't even know about yet.

When you take all of these folks -- customers and brands -- and all of these technologies that consumers want to engage with across all of these different channels -- it's simple: they want to be heard. It's now the responsibility of brands to determine the best way to respond, and it's not always one-to-one.

So it’s not a phone call for a phone call, it’s maybe an SMS to a phone call, or a phone call to a web chat -- whatever those [multi-channels] may be. The complexity of how we communicate with customers has increased. The needs have changed dramatically. And the legacy types of technologies out there, they can't keep up -- that's what's really driven the shift, the paradigm shift, within the contact center space.

Gardner: It's interesting that the new business channels for marketing and capturing business are growing more complex. Businesses still have to match on the back end how they support those users, interact with them, and carry them through any process -- whether it's on-boarding and engaging, or supporting and servicing them.

What we require, then, is a different architecture to support all of that. It seems very auspicious that we have architectural improvements arriving right along with these new requirements.

Triant: We have two things that have collided at the same time -- cloud technologies and the growth of truly global companies.

Most of the new channels that have rolled out are in the cloud. I mean, think about it -- Facebook is a cloud technology, Twitter is a cloud technology. WeChat, Viber, all these things, they are all cloud technologies. It's becoming a Software-as-a-Service (SaaS)-based world. The easiest and best way to integrate with these other cloud technologies is via the cloud -- versus on-premises. So what began as the shift of on-premises technology to cloud contact center -- and that really began in 2011-2012 -- has rapidly picked up speed with the adoption of multi-channels as a primary method of communication.

The only way to keep up with the pace of development of all these channels is through cloud technologies because you need to develop an agile world, you need to be able to get the upgrades out to customers in a quick fashion, in an easy fashion, and in an inexpensive fashion. That's the core difference between the on-premises world and the cloud world.

At the same time, we are no longer talking about a United States company, an Australia company, or a UK company -- we are talking about global brands and global businesses. Customer service is global now, and no one cares about borders or countries when it comes to communicating with a brand.
Customer service is global now, and no one cares about borders or countries when it comes to communications with a brand.


Gardner: We have been speaking about this in the context of the end-user, the consumer. But this architecture and its ability to leverage cloud also benefit the agent -- the person who is responsible for keeping that end-user happy and providing them with the utmost in intelligent services. So how does the new architecture also aid and abet the agent?

Triant: The agent is frankly one of the most important pieces to this entire puzzle. We talk a lot about channels and how to engage with the customer, but that's really what we call listening. But even in just simple day-to-day human interactions, one of the most important things is how you communicate back. There has been a series of time-and-motion studies done within contact centers, within brands -- and you can even look at your personal experiences. You don’t have to read reports to understand this.
The baseline for how an interaction will begin and end, and whether it will be a happy or a poor interaction with the brand, is going to depend on the agent's state of mind. If I call up and I speak to "Joe," and he starts the conversation in a great mood, having a great day, then my conversation will most likely end in a positive interaction because it started that way.

But if someone is frustrated, they had a rough day, they can’t find their information, their computers have been crashing or rebooting, then the interaction is guaranteed to end up poor. You hear this all the time, “Oh, can you wait a moment, my systems are loading. Oh, I can’t get you an answer, that screen is not coming up. I can't see your account information.” The agents are frustrated because they can’t do their job, and that frustration then blends into your conversation.

So using the technology to make it easy for the agent to do their job is essential. If they have to go from one screen to another screen to conduct one interaction with the customer -- they are going to be frustrated, and that will lead to a poor experience with the customer.

Cloud technologies like Serenova, which is web-based, are able to bring all those technologies into one screen. The agent can have all the information brought to them easily, in one click, and then be able to answer all the customer's needs. The agent is happy, and that adds to customer satisfaction. The conclusion of the call is a happy customer, which is what we all want. That's a great scenario, and you need cloud technology to do it, because the on-premises world does not deliver a great agent experience.

One-stop service

Gardner: Another thing that the older technologies don't provide is the ability to have a flexible spectrum to move across these channels. Many times when I engage with an organization I might start with an SMS or a text chat, but then if that can't satisfy my needs, I want to get a deeper level of satisfaction. So it might end up going to a phone call or an interaction on the web, or even a shared desktop, if I'm in IT support, for example.

The newer cloud technology allows you to interact via different types of channels, but you can also escalate and move between and among them seamlessly. Why is that flexibility a benefit to the end-user as well as the agent?

Triant: I always tell companies and customers of ours that you don't have to over-think this; all you have to do is look to your personal life. With the most common things we as users deal with -- cell phone companies, cable companies, airlines -- you can get onto any of these websites and begin chatting, but you can find that your interaction isn't going well. Before I started at Serenova, I had these experiences where I was dealing with the cable company and -- chat, chat, chat -- trying to solve my problem. But we couldn't get there, and so then we needed to get on the phone. But they said, "Here is our 800 number, call in." I'd call in, but I'd have to start a whole new interaction.

Basically, I’d have to re-explain my entire situation. Then, I am talking with one person, and they have to turn around and send me an email, but I am not going to get that email for 30 to 45 minutes because they have to get off the phone, and get into another system and send it off. In the meantime, I am frustrated, I am ticked off -- and guess what I have done now? I have left that brand. This happens across the board. I can even have two totally different types of interactions with the company.

You can use a major airline brand as an example. One of our employees called on the phone trying to resolve an issue that was caused by the airline. They basically said, "No, no, no." It made her very frustrated, and she decided she's going to fly with a different airline now. She then sent a social post [to that effect], and the airline's VP of Customer Service answered it, and within minutes they had resolved her issue. But they had already spent three hours on the phone trying to push her off to yet another channel, because it was a totally different group, a totally different experience.

By leveraging technologies where you can pivot from one channel to another, everyone gets answers quicker. I can be chatting with you, Dana, and realize that we need to escalate to a voice conversation, for example, and I, as the agent, can then turn that conversation into a voice call. You don't have to re-explain yourself, and you are like, "Wow, that's cool! Now I'm on the phone with a facility," and we are able to handle our business.

As the agent, I can also pivot simultaneously to an email channel to send you something as simple as a user guide or a series of knowledge-based articles that I have at my fingertips -- while you and I are still on the phone call. Even better, after the fact, as a business I have all the analytics and business intelligence to say that I had one interaction with Dana that started out as a web chat, pivoted to a phone call, and simultaneously sent a knowledge-based article on issue "X" -- and I can report on it all at once. Not three separate interactions, not three separate events -- and I have made you a happy customer.
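
To picture that, here is a rough sketch of how a single interaction record could accumulate those channel pivots so that reporting sees one interaction rather than three -- the class and field names are illustrative, not Serenova's actual data model:

```python
from dataclasses import dataclass, field

@dataclass
class Interaction:
    """One customer interaction that spans any number of channel pivots."""
    customer_id: str
    channels: list[str] = field(default_factory=list)

    def pivot(self, channel: str) -> None:
        """Record an escalation or side-channel without opening a new case."""
        self.channels.append(channel)

ixn = Interaction("dana")
ixn.pivot("web_chat")   # conversation starts as a chat
ixn.pivot("voice")      # escalated to a call
ixn.pivot("email")      # knowledge-base article sent mid-call
print(ixn.channels)     # ['web_chat', 'voice', 'email'] -- one reportable interaction
```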

Gardner: We are clearly talking about enabling the agent to be a super-agent, and they can, of course, be anywhere. I think this is really important now because the function of an agent -- and we are already seeing the beginnings of this -- is certainly going to include more artificial intelligence (AI) and machine learning, and the associated data analytics benefits. The agent then might be a combination of human and AI functions and services.

So we need to be able to integrate at the core communications level. Without going too far down this futuristic route, isn't it important for that agent to be an assimilation of more assets and more services over time?

Artificial Intelligence plus human support

Triant: I'm glad you brought up AI and these other technologies. The reality is that we've been through a number of cycles around what this technology is going to do and how it is going to interact with an agent. In my view -- and I have been in this world for a while -- the agent is the most important piece of customer service and brand engagement. But you have to be able to bring information to the agent, and you have to be able to give information to your customers, so that if there is something simple, you get it to them as quickly as possible -- but also bring all the relevant information to the agent.

AI has had multiple forms; it has existed for a long time. Sometimes people get confused because of marketing schemes and sales tactics [and view AI] as a way for cost avoidance -- to reduce agents and eliminate staff by implementing these technologies. Really, the focus is how to create a better customer experience and a better agent experience.

We have had AI in our product for the last three years, and we are re-releasing some components that will bring business intelligence to the forefront around the end of the year. What it essentially does is allow you to see what a user is doing out on the Internet and within these technologies. I can see that you have been looking for knowledge-based articles around, for example, "why my refrigerator keeps freezing up and how can I defrost it." You can see such things on Twitter and on Facebook. The amount of information that exists out there is phenomenal, and it is in real time. I can now gather that information … and I can proactively, as a business, make decisions about what I want to do with you as a potential consumer.

I can even identify you as a consumer within my business, know how many products you have acquired from me, and whether you're a “platinum” customer or even a basic customer, and then make a decision.

For example, I have TVs, refrigerators, washer-dryers and other appliances all from the same manufacturer. So I am a large consumer to that one manufacturer because all of my components are there. But I may be searching a knowledge-based article on why the refrigerator continues to freeze up.

Now, I may call in about just the refrigerator, but wouldn't it be great for that agent to know that I own 22 other products from that same company? I'm not just calling about the refrigerator; I am technically calling about the entire brand. My experience around the refrigerator freezing up may change my entire brand decision going forward. That information may prompt me, as the business, to decide that I want to route that customer to a different pool of agents, based on what their total lifetime value is as a brand-level consumer.

Through AI, by leveraging all this information, I can be a better steward to my customer and to the agent. Because, I will tell you, an agent will act differently if they understand the importance of that customer, or know that I, Vasili, have spent the last two hours searching online for information, and that I posted on Facebook and on Twitter.
Through AI, by leveraging all this information, I can be a better steward to the customer and to the agent.

At that point, my frustration has already reached a certain height on the scale. As an agent, if you knew that, you might treat me differently, because you already know that I am frustrated. The agent may realize that you have been looking for information on this, and that you have been on Facebook and Twitter. They can then say: "I am really sorry you were not able to get answers. Let me see how I can help you; it seems you have been looking online at how to keep the refrigerator from freezing up."

If I start the conversation that way, I've now defused a lot of the customer's frustration. The agent has started that interaction better. Bringing that information to that person -- that's powerful, that's business intelligence, and that's creating action from all that information.
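
As a sketch of that routing idea -- assuming hypothetical lifetime-value and frustration scores computed upstream from purchase history, recent searches, and social posts, not an actual Serenova API -- the decision could look like this:

```python
# Value- and sentiment-aware routing; thresholds and pool names are assumptions.
def route(lifetime_value: float, frustration: float) -> str:
    """Pick an agent pool from customer value and pre-contact sentiment."""
    if frustration > 0.7:
        return "retention_specialists"  # already upset: most skilled agents
    if lifetime_value > 10_000:
        return "premium_pool"           # platinum, multi-product customer
    return "general_pool"

# A high-value customer who has been venting on social media for two hours.
print(route(lifetime_value=22_000, frustration=0.8))  # retention_specialists
```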

Keep your cool

Gardner: It's fascinating that that level of sentiment analysis brings together the best of what AI and machine learning can do -- analyze all of these threads of data and information and determine a temperature, if you will, of a person's mood -- and pass that on to a human agent, who then has the emotional capacity to help that person get to a lower temperature and be better able to help them overall.

It's becoming clear to me, Vasili, that this contact center function and the CCaaS architectural benefits are far more strategic to an organization than we may have thought -- that it is about more than just customer service. This really is the best interface between a company and its customers, drawing on all the resources and assets it has across customer service, marketing, and sales interactions. Do you agree that this has become far more strategic because of these new capabilities?

Triant: Absolutely. And as brands begin to realize the power of what the technology can do for their overall business, it will continue to evolve and gain pace toward global adoption.
As brands begin to realize the power of what the technology can do for their overall businesses, it will continue to evolve and gain global adoption.

We have only scratched the surface on adoption of these cloud technologies within organizations. A majority of brands out there look at these interactions as a cost of doing business. They still seek to reduce that cost rather than weigh the lifetime value of the consumer and the agent experience. This will shift -- it is shifting -- and there are companies that are thriving by recognizing that entire equation and how to leverage the technologies.

Technology is nothing without action and result. There have been some really cool things that have existed for a while, but they never produce any result that's meaningful to the customer, so they never get adopted and deployed, and never reach any mass proliferation of results.

Gardner: You mentioned cost. Let's dig into that. For organizations that are attracted to the capabilities and the strategic implications of CCaaS, how do we evaluate it in terms of cost? The old CapEx approach often had a high upfront cost, and then high operating costs if you had an inefficient call center. Other costs involve losing customers, losing brand affinity, and losing your perception in the market. So when you talk to a prospect or customer, how do you help them understand a pay-as-you-go service as highly efficient? Does the highly empowered agent approach save money, or even make money, so that CCaaS becomes not a cost center but a revenue generator?

Cost consciousness

Triant: Interesting point, Dana. When I started at Serenova about five years ago, customers would say all the time, "What's the cost of owning the technology?" And, "Oh, my on-premises stuff has already depreciated and I already own it, so it's cheaper for me to keep it." That was the conversation pretty much every day. Beginning in 2013, that rapidly started shifting, mainly driven by the fact that organizations realized that consumers want to engage on different channels, and the on-premises guys couldn't keep up with this demand.

The cost of ownership no longer matters. What matters is that the on-premises guys literally could not deliver the functionality. And so, whether that's Cisco, Avaya, or Shoretel, they quickly started falling out of consideration at companies that were looking to deploy applications for their business to meet these needs.

The cost of ownership quickly disappeared as the main discussion point. Instead it came around to, "What is the solution that you're going to deliver?" Customers looking for contact center technologies are beginning to take a cloud-first approach. And once they see the power of CCaaS through demonstrations and trials of what an agent can do -- and it's all browser-based, with no client install and no equipment on-premises -- then it takes on a life of its own. It's about, "What is the experience going to be? Are these channels all integrated? Can I get it all from one manufacturer?"

Following that, organizations focus on other intricacies: Can it scale? Can it be redundant? Is it global? But those become architectural concerns for the brands themselves. There is a chunk of the industry that is not looking at these technologies; they are stuck in brand euphoria, or feel they have to stay with on-premises infrastructure or with a certain vendor because of its name, or believe they are going to get there someday.

As we have seen, Avaya has declared bankruptcy. Avaya does not have cloud technologies, despite its marketing message. So the customers on those technologies now realize they have to find a path to keep up with basic customer service at a global scale. Unfortunately, they need a path forward and they don't have one right now.
It's less about cost of ownership, and more about the high cost of not doing anything. If I don't do anything, what's going to be the cost? That cost ultimately becomes: I'm not going to be able to have engagement with my customers, because the consumers are changing.
It's less about cost of ownership and it's more about the high cost of not doing anything.

Gardner: What about this idea of considering your contact center function not just as a cost center, but also as a business development function? Am I being too optimistic?

It seems to me that as AI and the best of human interaction combine across multiple channels, this becomes no longer just a cost center for support -- a check-off box -- but a strategic must-do for any business.

Multi-channel customer interaction

Triant: When an organization reaches the pinnacle of happiness with what these technologies can do, it will realize that you no longer need a delineation between a marketing department that answers social media posts, an inside sales department that only takes calls for upgrades and renewals, and a customer service department that deals with complaints or inbound questions. It will see that you can leverage all the applications across a pool of agents with different skills.

I may have a higher skill around social media than around voice, or a higher skill around sales or renewal activity than around customer service problems, but I should be able to handle any interaction. Potentially, one day it'll just be a customer interaction department, and the channels will just be a medium of inbound and outbound choice for a brand.

But you can now take information from whatever you see the customer doing. Each of their actions has a leading indicator; everything has a predictive action prior to the inbound touch -- everything does. Now that a brand can see that, it will be able to have "consumer interaction departments," and each inquiry will be properly routed to the right person based on that information. You'll be able to bring information to that agent that will allow them to answer the customer's questions.

Gardner: I can see how that agent's job would be very satisfying and fulfilling when you are that important, when you have that sort of key role in your organization that empowers people. That's good news for people who are trying to find those skills and fill those positions.

Vasili, we only have a few minutes left, but I'd love to hear a couple of examples. It's one thing to tell, it's another thing to show. Do we have some examples of organizations that have embraced this concept of a strategic contact center -- taken advantage of those multiple channels, added perhaps some intelligence, and improved the status and capability of the agents -- all to some business benefit? Walk us through a couple of actual use cases where this has all come together.

Cloud communication culture shift

Triant: No one has reached that level of euphoria per se, but there are definitely companies that are moving in that direction.

It is a culture change, so it takes time. I know as well as anybody what it takes to shift a culture, and it doesn't happen overnight. As an example, there is a ride-hailing company that engages in a different way with its consumers -- and their consumer might be different from what you would think from the way I am describing it. They use voice systems and SMS, and often want to pivot between the two. Our technology allows the agent to make that decision even if they aren't physically in the same country. The agents are dynamically spread across multiple countries to answer any question they may need to, based on time and day.

But they can pivot from what's predominantly an SMS inbound and outbound communication into a voice interaction, and then follow up with an e-mail -- and that's already happened. It initially started with SMS inbound and outbound, then they added voice -- an interesting move, as most people think voice is what everyone is getting away from. What everyone has begun to realize is that live communication is ultimately what everybody looks for in the end to solve the more complex problems.
What everyone has begun to realize is that live communication ultimately is what everybody looks for in the end to solve the more complex problems.

That's one example. Another company, which provides the latest technology in food ordering and delivery, initially started with voice only to order and deliver food. Now they've added automatic SMS confirmations, and e-mail as well, for confirmation or for more information from the inbound voice call. And now, once someone is an existing customer, they can even start an order from an SMS and pivot back to a voice call for confirmation -- all within one interaction. They are literally one of the fastest-growing alternative food delivery companies, growing at a global scale.

They are deploying agents globally across one technology. They would not be able to do this with legacy technologies because of the expense. When you get into these kinds of high-volume, low-margin businesses, cost matters. When you have an OpEx model that scales, you are adding better customer service to the applications, and you allow them to build a profitable model because you are not burdening them with high CapEx processes.

Gardner: Before we sign off, you mentioned your pipeline of products and services, such as engaging more with AI capabilities toward the end of the year. Could you give us a level-set on your roadmap? Where are your products and services now, and where do you go next?

A customer journey begins with insight

Triant: We have been building cloud technologies for 16 years in the contact center space. We released our latest CCaaS platform, called CxEngage, in March 2016. We then had a major upgrade to the platform in March of this year, where we took the agent experience to the next level. It's really our leapfrog in the agent interface -- making it easier and bringing more information to agents.

Where we are going next is around the customer journey -- predictive interactions. Some people call it AI, but I will call it "customer journey mapping with predictive action insights." That's going to be a big cornerstone in our product, including business analytics. It's focused on looking at a combination of speech, data, and text -- all simultaneously creating predictive actions. This is another core area we are going into, and we continue to expand the reach of our platform on a global scale.

At this point, we are a global company. We have the only global cloud platform built on a single software stack with one data pipeline. We now have more users on a pure cloud platform than any of our competitors globally. I know that's a big statement, but when you look at a pure cloud infrastructure, you're talking in a whole different realm of what services you are able to offer to customers. Our ability to provide broad reach -- including to Europe, South Africa, Australia, India, and Singapore -- and still deliver good cloud quality, at a reasonable cost and in a redundant fashion, makes us second to none in that space.

Gardner: I'm afraid we will have to leave it there. We have been listening to a sponsored BriefingsDirect discussion on how CCaaS capabilities are becoming more powerful as a result of cloud computing, multi-mode communications channels, and the ability to provide optimized and contextual user experiences.

And we’ve learned how new levels of insight and intelligence are now making CCaaS approaches able to meet the highest user experience requirements of today and tomorrow. So please join me now in thanking our guest, Vasili Triant, CEO of Serenova in Austin, Texas.

Triant: Thank you very much, Dana. I appreciate you having me today.

Gardner: This is Dana Gardner, Principal Analyst at Interarbor Solutions, your host and moderator for this ongoing series of BriefingsDirect discussions. A big thank you to our sponsor, Serenova, as well as to you, our audience. Do come back next time, and thanks for listening.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.