XML.com

One Month of XML Slack

May 17, 2020

Adam Retter

The XML Slack channel has been available for a month and the community is growing. Adam Retter uses XQuery to show the most common topics of discussion.

About a month ago when I set up the XML Slack Workspace, I did so because I thought there was a need for a real-time communication mechanism. Whilst various products and projects have their own chat systems, there didn't appear to be something independent for the larger community.

I have to admit that I was not sure how, or if, this would work out. I wondered whether people would come, share their time, and help each other. Perhaps people were already happy in their own sub-communities, or already too busy with other concerns? One scenario that I envisaged was of an empty street with tumbleweeds rolling around.

So... now that we are one month in, how is it working out?

I am very happy :-) The number of users has exceeded my expectations, and there have been some excellent discussions. Users are genuinely supporting and helping each other with the questions and issues that they face. Apart from Q&A, there has also been some fun general discussion and debate.

Just a few days ago I received the following positive encouragement from Slack:

XML Slack is on a roll

In a little over a month, the XML.com Workspace has become a community of 197 members who have exchanged over 2,600 messages.

Active members
XML Slack analytics

I think our group of users represents a great mix of abilities and skills. We have some heavy hitters (you know who you are!) who have authored books, standards, and the software tools that many work with daily. We also have plenty of beginners who are perhaps just starting out or dipping a toe into various XML technologies. Then, of course, in the middle are the daily practitioners, whose jobs involve applying the techniques and technologies.

As a casual observation, the most popular topic so far seems to be XSLT. Apart from XML itself, it is arguably the most successful of the XML technologies, so I was happy to see that our community is helpful for XSLT users. However, I wondered which other topics were popular, so I took an export of all of the (public) conversations from Slack and wrote some very simple and rough XQuery to extract the meaningful text and generate a word cloud.

xquery version "3.1";

import module namespace file = "http://exist-db.org/xquery/file";
import module namespace process = "http://exist-db.org/xquery/process";
import module namespace util = "http://exist-db.org/xquery/util";

declare variable $slack-export-data-dir := "file:///Users/aretter/tmp-code/xml-slack-stats/data/202005171235";
declare variable $txt-file := "/tmp/xml-slack-text.txt";
declare variable $image-file := "/tmp/xml-slack-wordcloud.png";

let $txt := string-join(
    let $channels-json := fn:unparsed-text($slack-export-data-dir || "/channels.json")
    let $channel-names := fn:json-to-xml($channels-json)/fn:array/fn:map/fn:string[@key eq "name"]/string(.)
    (: only the general channel is processed; swap in $channel-names to cover every channel :)
    for $channel-name in "general" (: $channel-names :)
    let $channel-dir := $slack-export-data-dir || "/" || $channel-name
    let $channel-json-files := file:directory-list($channel-dir, "*.json")//file:file/@name/string(.)
    return
        for $channel-json-file in $channel-json-files
        let $channel-json := fn:unparsed-text($channel-dir || "/" || $channel-json-file)
        let $channel-data := fn:json-to-xml($channel-json)
        let $messages := $channel-data/fn:array/fn:map/fn:string[@key eq "text"]/string(.)
        let $clean-messages := $messages
            ! replace(., "<@U[0-9A-Z]+>\s*", "")                                        (: remove @user mentions :)
            ! replace(., "<(https?://|file://|mailto:)[^|>]+(\|[^>]+)?>\s*", "")        (: remove hyperlinks; mailto: links carry no "//" :)
            ! replace(., "```[^`]+```\s*", "")                                          (: remove block code fragments :)
            ! replace(., "`[^`]+`\s*", "")                                              (: remove inline code fragments :)
            ! replace(., ":[a-z0-9_]+:", "")                                            (: remove Emoji :)
        return
            $clean-messages
    , " ")
return
    (: write the cleaned text to disk, then run the external wordcloud_cli tool on it :)
    let $_ := file:serialize-binary(util:base64-encode($txt) cast as xs:base64Binary, $txt-file)
    return
        process:execute(("wordcloud_cli", "--text", $txt-file, "--fontfile", "/System/Library/Fonts/Optima.ttc", "--width", "800", "--imagefile", $image-file), ())
XML Slack Wordcloud

The word cloud paints an interesting picture of the current use of our XML Slack Workspace. We can see that XSLT is indeed the most common keyword, and that XQuery is perhaps second. We also see some other niceties like "I'm" and "Thank", likely related to good questions and answers, and some nice coincidental strings like "Yes Saxon"!
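To make the cleaning steps easier to follow, here is a minimal sketch that applies them to a single message. The sample message and the user ID in it are invented for illustration, and the hyperlink pattern is simplified to cover only http(s) links; it uses only standard XQuery 3.1 functions, so no eXist-db modules are needed.

```xquery
xquery version "3.1";

(: Hypothetical message in Slack's export markup; not taken from the real export :)
declare variable $sample :=
    "<@U012ABCDEF> try `xsl:iterate` :thumbsup: see <https://www.w3.org/TR/xslt-30/|the spec> for details";

let $cleaned := $sample
    ! replace(., "<@U[0-9A-Z]+>\s*", "")                 (: drop the @user mention :)
    ! replace(., "<https?://[^|>]+(\|[^>]+)?>\s*", "")   (: drop the link and its label :)
    ! replace(., "`[^`]+`\s*", "")                       (: drop the inline code :)
    ! replace(., ":[a-z0-9_]+:", "")                     (: drop the emoji shortcode :)
return
    (: collapse the whitespace left behind by the removals :)
    normalize-space($cleaned)
```

This should evaluate to the string "try see for details", leaving only the ordinary words for the word cloud to count. The order matters: code fragments and links are removed before emoji so that colons inside them are never mistaken for emoji shortcodes.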

I also wondered who the most active users were, so I wrote another little XQuery to work this out:

xquery version "3.1";

import module namespace file = "http://exist-db.org/xquery/file";

declare variable $slack-export-data-dir := "file:///Users/aretter/tmp-code/xml-slack-stats/data/202005171235";

let $users-json := fn:unparsed-text($slack-export-data-dir || "/users.json")
let $users := fn:json-to-xml($users-json)/fn:array/fn:map

(: collect every message map from the exported channel files :)
let $messages := (
    let $channels-json := fn:unparsed-text($slack-export-data-dir || "/channels.json")
    let $channel-names := fn:json-to-xml($channels-json)/fn:array/fn:map/fn:string[@key eq "name"]/string(.)
    (: only the general channel is processed; swap in $channel-names to cover every channel :)
    for $channel-name in "general" (: $channel-names :)
    let $channel-dir := $slack-export-data-dir || "/" || $channel-name
    let $channel-json-files := file:directory-list($channel-dir, "*.json")//file:file/@name/string(.)
    return
        for $channel-json-file in $channel-json-files
        let $channel-json := fn:unparsed-text($channel-dir || "/" || $channel-json-file)
        let $channel-data := fn:json-to-xml($channel-json)
        return
            $channel-data/fn:array/fn:map[fn:string[@key eq "type"][. eq "message"]]
)
return
    (: count the messages per user ID and look up each user's name :)
    let $user-ids := distinct-values($messages/fn:string[@key eq "user"]/string(.))
    for $user-id in $user-ids
    let $message-count := count($messages[fn:string[@key eq "user"][. eq $user-id]])
    let $user-name := $users[fn:string[@key eq "id"][. eq $user-id]]/fn:string[@key eq "name"]/string(.)
    order by $message-count descending
    return
        <user messages="{$message-count}">{$user-name}</user>

I haven't published the results of that one, as I am not sure everyone would be comfortable with me pasting their names here. So I will leave it to you to work out by joining the community ;-)
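For anyone who wants to try the counting logic without a Slack export, the following is a small sketch that runs over inline data. The JSON and its user IDs are invented for illustration, and it swaps the distinct-values loop for a `group by` clause, which is simply an alternative way to express the same tally in XQuery 3.1.

```xquery
xquery version "3.1";

(: Hypothetical messages in the shape of a Slack export file; the user IDs are invented :)
declare variable $json := '[
  {"type": "message", "user": "U001", "text": "How do I group in XSLT?"},
  {"type": "message", "user": "U002", "text": "Try xsl:for-each-group"},
  {"type": "message", "user": "U001", "text": "Thank you!"}
]';

(: group the message maps by user ID, then count each group :)
for $message in json-to-xml($json)/fn:array/fn:map[fn:string[@key eq "type"] eq "message"]
group by $user-id := $message/fn:string[@key eq "user"]/string(.)
order by count($message) descending
return
    <user id="{$user-id}" messages="{count($message)}"/>
```

With this sample data, the user U001 (two messages) should be listed before U002 (one message). After the `group by` clause, `$message` is bound to all of the messages in the group, which is why `count($message)` gives the per-user total.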

So far we have avoided creating topic-specific channels, as traffic is not overwhelming in the General channel, and users have been very good at making use of Slack's thread facilities. Obviously we may need to review this as time goes by, but at the moment I think we have struck a good balance between inclusion and digestibility.

One last thing to mention: Lauren Wood and I (Adam Retter) are the Slack admins. So far our job has been easy, and we have had no violations of our Code of Conduct. The only intervention needed was to gently remind one new user that this wasn't really the place for unsolicited advertising of their product. To be clear, we are happy for people to announce new releases of their software and tools, etc., but it has to be within the spirit of the community. If anybody does need to talk to us in our role as admins, feel free to reach out.