1 LWN.NET WEEKLY EDITION FOR AUGUST 30, 2018
2
3
4
5 o News link: https://lwn.net/Articles/763252/
6 o Source link:
7
8
9 [1]Welcome to the LWN.net Weekly Edition for August 30, 2018
10 This edition contains the following feature content:
11
12 [2]An introduction to the Julia language, part 1 : Julia is a
13 language designed for intensive numerical calculations; this
14 article gives an overview of its core features.
15
16 [3]C considered dangerous : a Linux Security Summit talk on
17 what is being done to make the use of C in the kernel safer.
18
19 [4]The second half of the 4.19 merge window : the final
20 features merged (or not merged) before the merge window closed
21 for this cycle.
22
23 [5]Measuring (and fixing) I/O-controller throughput loss : the
24 kernel's I/O controllers can provide useful bandwidth
25 guarantees, but at a significant cost in throughput.
26
27 [6]KDE's onboarding initiative, one year later : what has gone
28 right in KDE's effort to make it easier for contributors to
29 join the project, and what remains to be done.
30
31 [7]Sharing and archiving data sets with Dat : an innovative
32 approach to addressing and sharing data on the net.
33
34 This week's edition also includes these inner pages:
35
36 [8]Brief items : Brief news items from throughout the
37 community.
38
39 [9]Announcements : Newsletters, conferences, security updates,
40 patches, and more.
41
42 Please enjoy this week's edition, and, as always, thank you
43 for supporting LWN.net.
44
45 [10]Comments (none posted)
46
47 [11]An introduction to the Julia language, part 1
48
49 August 28, 2018
50
51 This article was contributed by Lee Phillips
52
53 [12]Julia is a young computer language aimed at serving the
54 needs of scientists, engineers, and other practitioners of
55 numerically intensive programming. It was first publicly
56 released in 2012. After an intense period of language
57 development, version 1.0 was [13]released on August 8. The 1.0
58 release promises years of language stability; users can be
59 confident that developments in the 1.x series will not break
60 their code. This is the first part of a two-part article
61 introducing the world of Julia. This part will introduce
62 enough of the language syntax and constructs to allow you to
63 begin to write simple programs. The following installment will
64 acquaint you with the additional pieces needed to create real
65 projects, and to make use of Julia's ecosystem.
66
67 Goals and history
68
69 The Julia project has ambitious goals. It wants the language
70 to perform about as well as Fortran or C when running
71 numerical algorithms, while remaining as pleasant to program
72 in as Python. I believe the project has met these goals and is
73 poised to see increasing adoption by numerical researchers,
74 especially now that an official, stable release is available.
75
76 The Julia project maintains a [14]micro-benchmark page that
77 compares its numerical performance against both statically
78 compiled languages (C, Fortran) and dynamically typed
79 languages (R, Python). While it's certainly possible to argue
80 about the relevance and fairness of particular benchmarks, the
81 data overall supports the Julia team's contention that Julia
82 has generally achieved parity with Fortran and C; the
83 benchmark source code is available.
84
85 Julia began as research in computer science at MIT; its
86 creators are Alan Edelman, Stefan Karpinski, Jeff Bezanson,
87 and Viral Shah. These four remain active developers of the
88 language. They, along with Keno Fischer, co-founder and CTO of
89 [15]Julia Computing , were kind enough to share their thoughts
90 with us about the language. I'll be drawing on their comments
91 later on; for now, let's get a taste of what Julia code looks
92 like.
93
94 Getting started
95
96 To explore Julia initially, start up its standard
97 [16]read-eval-print loop (REPL) by typing julia at the
98 terminal, assuming that you have installed it. You will then
99 be able to interact with what will seem to be an interpreted
100 language — but, behind the scenes, those commands are being
101 compiled by a just-in-time (JIT) compiler that uses the
102 [17]LLVM compiler framework . This allows Julia to be
103 interactive, while turning the code into fast, native machine
104 instructions. However, the JIT compiler passes sometimes
105 introduce noticeable delays at the REPL, especially when using
106 a function for the first time.
107
108 To run a Julia program non-interactively, execute a command
109 like: $ julia script.jl
110
111 Julia has all the usual data structures: numbers of various
112 types (including complex and rational numbers),
113 multidimensional arrays, dictionaries, strings, and
114 characters. Functions are first-class: they can be passed as
115 arguments to other functions, can be members of arrays, and so
116 on.
117
118 Julia embraces Unicode. Strings, which are enclosed in double
119 quotes, are arrays of Unicode characters, which are enclosed
120 in single quotes. The " * " operator is used for string and
121 character concatenation. Thus 'a' and 'β' are characters, and
122 'aβ' is a syntax error. "a" and "β" are strings, as are "aβ",
123 'a' * 'β', and "a" * "β" — all evaluate to the same string.
124
125 Variable and function names can contain non-ASCII characters.
126 This, along with Julia's clever syntax that understands
127 numbers prepended to variables to mean multiplication, goes a
128 long way to allowing the numerical scientist to write code
129 that more closely resembles the compact mathematical notation
130 of the equations that usually lie behind it: julia> ε₁ = 0.01
131
132 0.01
133
134 julia> ε₂ = 0.02
135
136 0.02
137
138 julia> 2ε₁ + 3ε₂
139
140 0.08
141
142 And where does Julia come down on the age-old debate of what
143 to do about 1/2 ? In Fortran and Python 2, this will get you 0,
144 since 1 and 2 are integers, and the result is rounded down to
145 the integer 0. This was deemed inconsistent, and confusing to
146 some, so it was changed in Python 3 to return 0.5 — which is
147 what you get in Julia, too.
148
149 While we're on the subject of fractions, Julia can handle
150 rational numbers, with a special syntax: 3//5 + 2//3 returns
151 19//15 , while 3/5 + 2/3 gets you the floating-point answer
152 1.2666666666666666. Internally, Julia thinks of a rational
153 number in its reduced form, so the expression 6//8 == 3//4
154 returns true , and numerator(6//8) returns 3 .
155
156 Arrays
157
158 Arrays are enclosed in square brackets and indexed with an
159 iterator that can contain a step value: julia> a = [1, 2, 3,
160 4, 5, 6]
161
162 6-element Array{Int64,1}:
163
164 1
165
166 2
167
168 3
169
170 4
171
172 5
173
174 6
175
176 julia> a[1:2:end]
177
178 3-element Array{Int64,1}:
179
180 1
181
182 3
183
184 5
185
186 As you can see, indexing starts at one, and the useful end
187 index means the obvious thing. When you define a variable in
188 the REPL, Julia replies with the type and value of the
189 assigned data; you can suppress this output by ending your
190 input line with a semicolon.
191
192 Since arrays are such a vital part of numerical computation,
193 and Julia makes them easy to work with, we'll spend a bit more
194 time with them than the other data structures.
195
196 To illustrate the syntax, we can start with a couple of 2D
197 arrays, defined at the REPL: julia> a = [1 2 3; 4 5 6]
198
199 2×3 Array{Int64,2}:
200
201 1 2 3
202
203 4 5 6
204
205 julia> z = [-1 -2 -3; -4 -5 -6];
206
207 Indexing is as expected: julia> a[1, 2]
208
209 2
210
211 You can glue arrays together horizontally: julia> [a z]
212
213 2×6 Array{Int64,2}:
214
215 1 2 3 -1 -2 -3
216
217 4 5 6 -4 -5 -6
218
219 And vertically: julia> [a; z]
220
221 4×3 Array{Int64,2}:
222
223 1 2 3
224
225 4 5 6
226
227 -1 -2 -3
228
229 -4 -5 -6
230
231 Julia has all the usual operators for handling arrays, and
232 [18]linear algebra functions that work with matrices (2D
233 arrays). The linear algebra functions are part of Julia's
234 standard library, but need to be imported with a command like
235 " using LinearAlgebra ", which is a detail omitted from the
236 current documentation. The functions include such things as
237 determinants, matrix inverses, eigenvalues and eigenvectors,
238 many kinds of matrix factorizations, etc. Julia has not
239 reinvented the wheel here, but wisely uses the [19]LAPACK
240 Fortran library of battle-tested linear algebra routines.
241
242 The extension of arithmetic operators to arrays is usually
243 intuitive: julia> a + z
244
245 2×3 Array{Int64,2}:
246
247 0 0 0
248
249 0 0 0
250
251 And the numerical prepending syntax works with arrays, too:
252 julia> 3a + 4z
253
254 2×3 Array{Int64,2}:
255
256 -1 -2 -3
257
258 -4 -5 -6
259
260 Putting a multiplication operator between two matrices gets
261 you matrix multiplication: julia> a * transpose(a)
262
263 2×2 Array{Int64,2}:
264
265 14 32
266
267 32 77
268
269 You can "broadcast" numbers to cover all the elements in an
270 array by prepending the usual arithmetic operators with a dot:
271 julia> 1 .+ a
272
273 2×3 Array{Int64,2}:
274
275 2 3 4
276
277 5 6 7
278
279 Note that the language only actually requires the dot for some
280 operators, but not for others, such as "*" and "/". The
281 reasons for this are arcane, and it probably makes sense to be
282 consistent and use the dot whenever you intend broadcasting.
283 Note also that the current version of the official
284 documentation is incorrect in claiming that you may omit the
285 dot from "+" and "-"; in fact, this now gives an error.
286
287 You can use the dot notation to turn any function into one
288 that operates on each element of an array: julia>
289 round.(sin.([0, π/2, π, 3π/2, 2π]))
290
291 5-element Array{Float64,1}:
292
293 0.0
294
295 1.0
296
297 0.0
298
299 -1.0
300
301 -0.0
302
303 The example above illustrates chaining two dotted functions
304 together. The Julia compiler turns expressions like this into
305 "fused" operations: instead of applying each function in turn
306 to create a new array that is passed to the next function, the
307 compiler combines the functions into a single compound
308 function that is applied once over the array, creating a
309 significant optimization.
310
311 You can use this dot notation with any function, including
312 your own, to turn it into a version that operates element-wise
313 over arrays.
314
315 Dictionaries (associative arrays) can be defined with several
316 syntaxes. Here's one: julia> d1 = Dict("A"=>1, "B"=>2)
317
318 Dict{String,Int64} with 2 entries:
319
320 "B" => 2
321
322 "A" => 1
323
324 You may have noticed that the code snippets so far have not
325 included any type declarations. Every value in Julia has a
326 type, but the compiler will infer types if they are not
327 specified. It is generally not necessary to declare types for
328 performance, but type declarations sometimes serve other
329 purposes that we'll return to later. Julia has a deep and
330 sophisticated type system, including user-defined types and
331 C-like structs. Types can have behaviors associated with them,
332 and can inherit behaviors from other types. The best thing
333 about Julia's type system is that you can ignore it entirely,
334 use just a few pieces of it, or spend weeks studying its
335 design.
336
337 Control flow
338
339 Julia code is organized in blocks, which can indicate control
340 flow, function definitions, and other code units. Blocks are
341 terminated with the end keyword, and indentation is not
342 significant. Statements are separated either with newlines or
343 semicolons.
344
345 Julia has the typical control flow constructs; here is a while
346 block: julia> i = 1;
347
348 julia> while i < 5
349
350 print(i)
351
352 global i = i + 1
353
354 end
355
356 1234
357
358 Notice the global keyword. Most blocks in Julia introduce a
359 local scope for variables; without this keyword here, we would
360 get an error about an undefined variable.
361
362 Julia has the usual if statements and for loops that use the
363 same iterators that we introduced above for array indexing. We
364 can also iterate over collections: julia> for i ∈ ['a', 'b',
365 'c']
366
367 println(i)
368
369 end
370
371 a
372
373 b
374
375 c
376
377 In place of the fancy math symbol in this for loop, we can use
378 " = " or " in ". If you want to use the math symbol but have
379 no convenient way to type it, the REPL will help you: type "
380 \in " and the TAB key, and the symbol appears; you can type
381 many [20]LaTeX expressions into the REPL in this way.
382
383 Development of Julia
384
385 The language is developed on GitHub, with over 700
386 contributors. The Julia team mentioned in their email to us
387 that the decision to use GitHub has been particularly good for
388 Julia, as it streamlined the process for many of their
389 contributors, who are scientists or domain experts in various
390 fields, rather than professional software developers.
391
392 The creators of Julia have [21]published [PDF] a detailed
393 “mission statement” for the language, describing their aims
394 and motivations. A key issue that they wanted their language
395 to solve is what they called the "two-language problem." This
396 situation is familiar to anyone who has used Python or another
397 dynamic language on a demanding numerical problem. To get good
398 performance, you will wind up rewriting the numerically
399 intensive parts of the program in C or Fortran, dealing with
400 the interface between the two languages, and may still be
401 disappointed in the overhead presented by calling the foreign
402 routines from your original code.
403
404 For Python, [22]NumPy and SciPy wrap many numerical routines,
405 written in Fortran or C, for efficient use from that language,
406 but you can only take advantage of this if your calculation
407 fits the pattern of an available routine; in more general
408 cases, where you will have to write a loop over your data, you
409 are stuck with Python's native performance, which is orders of
410 magnitude slower. If you switch to an alternative, faster
411 implementation of Python, such as [23]PyPy , the numerical
412 libraries may not be compatible; NumPy became available for
413 PyPy only within about the past year.
414
415 Julia solves the two-language problem by being as expressive
416 and simple to program in as a dynamic scripting language,
417 while having the native performance of a static, compiled
418 language. There is no need to write numerical libraries in a
419 second language, but C or Fortran library routines can be
420 called using a facility that Julia has built-in. Other
421 languages, such as [24]Python or [25]R , can also interoperate
422 easily with Julia using external packages.
423
424 Documentation
425
426 There are many resources available for learning the language.
427 There is an extensive and detailed [26]manual at Julia
428 headquarters, and this may be a good place to start. However,
429 although the first few chapters provide a gentle introduction,
430 the material soon becomes dense and, at times, hard to follow,
431 with references to concepts that are not explained until later
432 chapters. Fortunately, there is a [27]"learning" link at the
433 top of the Julia home page, which takes you to a long list of
434 videos, tutorials, books, articles, and classes both about
435 Julia and that use Julia in teaching subjects such as numerical
436 analysis. There is also a fairly good [28]cheat-sheet [PDF] ,
437 which was just updated for v. 1.0.
438
439 If you're coming from Python, [29]this list of noteworthy
440 differences between Python and Julia syntax will probably be
441 useful.
442
443 Some of the linked tutorials are in the form of [30]Jupyter
444 notebooks — indeed, the name "Jupyter" is formed from "Julia",
445 "Python", and "R", which are the three original languages
446 supported by the interface. The [31]Julia kernel for Jupyter
447 was recently upgraded to support v. 1.0. Judicious sampling of
448 a variety of documentation sources, combined with liberal
449 experimentation, may be the best way of learning the language.
450 Jupyter makes this experimentation more inviting for those who
451 enjoy the web-based interface, but the REPL that comes with
452 Julia helps a great deal in this regard by providing, for
453 instance, TAB completion and an extensive help system invoked
454 by simply pressing the "?" key.
455
456 Stay tuned
457
458 The [32]next installment in this two-part series will explain
459 how Julia is organized around the concept of "multiple
460 dispatch". You will learn how to create functions and make
461 elementary use of Julia's type system. We'll see how to
462 install packages and use modules, and how to make graphs.
463 Finally, Part 2 will briefly survey the important topics of
464 macros and distributed computing.
465
466 [33]Comments (80 posted)
467
468 [34]C considered dangerous
469
470 By Jake Edge
471
472 August 29, 2018
473
474 [35]LSS NA
475
476 At the North America edition of the [36]2018 Linux Security
477 Summit (LSS NA), which was held in late August in Vancouver,
478 Canada, Kees Cook gave a presentation on some of the dangers
479 that come with programs written in C. In particular, of
480 course, the Linux kernel is mostly written in C, which means
481 that the security of our systems rests on a somewhat dangerous
482 foundation. But there are things that can be done to help firm
483 things up by " Making C Less Dangerous " as the title of his
484 talk suggested.
485
486 He began with a brief summary of the work that he and others
487 are doing as part of the [37]Kernel Self Protection Project
488 (KSPP). The goal of the project is to get kernel protections
489 merged into the mainline. These protections are not targeted
490 at protecting user-space processes from other (possibly rogue)
491 processes, but are, instead, focused on protecting the kernel
492 from user-space code. There are around 12 organizations and
493 ten individuals working on roughly 20 different technologies
494 as part of the KSPP, he said. The progress has been "slow and
495 steady", he said, which is how he thinks it should go. [38]
496
497 One of the main problems is that C is treated mostly like a
498 fancy assembler. The kernel developers do this because they
499 want the kernel to be as fast and as small as possible. There
500 are other reasons, too, such as the need to do
501 architecture-specific tasks that lack a C API (e.g. setting up
502 page tables, switching to 64-bit mode).
503
504 But there is lots of undefined behavior in C. This
505 "operational baggage" can lead to various problems. In
506 addition, C has a weak standard library with multiple utility
507 functions that have various pitfalls. In C, the content of
508 uninitialized automatic variables is undefined, but in the
509 machine code that it gets translated to, the value is whatever
510 happened to be in that memory location before. In C, a
511 function pointer can be called even if the type of the pointer
512 does not match the type of the function being called—assembly
513 doesn't care, it just jumps to a location, he said.
514
515 The APIs in the standard library are also bad in many cases.
516 He asked: why is there no argument to memcpy() to specify the
517 maximum destination length? He noted a recent [39]blog post
518 from Raph Levien entitled "With Undefined Behavior, Anything
519 is Possible". That obviously resonated with Cook, as he
520 pointed out his T-shirt—with the title and artwork from the
521 post.
522
523 Less danger
524
525 He then moved on to some things that kernel developers can do
526 (and are doing) to get away from some of the dangers of C. He
527 began with variable-length arrays (VLAs), which can be used to
528 overflow the stack to access data outside of its region. Even
529 if the stack has a guard page, VLAs can be used to jump past
530 it to write into other memory, which can then be used by some
531 other kind of attack. The C language is "perfectly fine with
532 this". It is easy to find uses of VLAs with the -Wvla flag,
533 however.
534
535 But it turns out that VLAs are [40]not just bad from a
536 security perspective , they are also slow. In a
537 micro-benchmark associated with a [41]patch removing a VLA , a
538 13% performance boost came from using a fixed-size array. He
539 dug in a bit further and found that much more code is being
540 generated to handle a VLA, which explains the speed increase.
541 Linus Torvalds has [42]declared that VLAs should be
542 removed from the kernel because they cause security problems
543 and also slow the kernel down, so Cook said "don't use VLAs".
544
545 Another problem area is switch statements, in particular where
546 there is no break for a case . That could mean that the
547 programmer expects and wants to fall through to the next case
548 or it could be that the break was simply forgotten. There is a
549 way to get a warning from the compiler for fall-throughs, but
550 there needs to be a way to mark those that are truly meant to
551 be that way. A special fall-through "statement" in the form of
552 a comment is what has been agreed on within the
553 static-analysis community. He and others have been going
554 through each of the places where there is no break to add
555 these comments (or a break ); they have "found a lot of bugs
556 this way", he said.
557
558 Uninitialized local variables will generate a warning, but not
559 if the variable is passed in by reference. There are some GCC
560 plugins that will automatically initialize these variables,
561 but there are also patches for both GCC and Clang to provide a
562 compiler option to do so. Neither of those is upstream yet,
563 but Torvalds has praised the effort so the kernel would likely
564 use the option. An interesting side effect that came about
565 while investigating this was a warning he got about
566 unreachable code when he enabled the auto-initialization.
567 There were two variables declared just after a switch (and
568 outside of any case ), where they would never be reached.
569
570 Arithmetic overflow is another undefined behavior in C that
571 can cause various problems. GCC can check for signed overflow,
572 which performs well (the overhead is in the noise, he said),
573 but adding warning messages for it does grow the kernel by 6%;
574 making the overflow abort, instead, only adds 0.1%. Clang can
575 check for both signed and unsigned overflow; signed overflow
576 is undefined, while unsigned overflow is defined, but often
577 unexpected. Marking places where unsigned overflow is expected
578 is needed; it would be nice to get those annotations put into
579 the kernel, Cook said.
580
581 Explicit bounds checking is expensive. Doing it for
582 copy_{to,from}_user() is a less than 1% performance hit, but
583 adding it to the strcpy() and memcpy() families is around a
584 2% hit. Pre-Meltdown that would have been a totally impossible
585 performance regression for security, he said; post-Meltdown,
586 since it is less than 5%, maybe there is a chance to add this
587 checking.
588
589 Better APIs would help as well. He pointed to the evolution of
590 strcpy() , through strncpy() and strlcpy() (each with
591 their own bounds flaws) to strscpy() , which seems to be "OK
592 so far". He also mentioned memcpy() again as a poor API with
593 respect to bounds checking.
594
595 Hardware support for bounds checking is available in the
596 application data integrity (ADI) feature for SPARC and is
597 coming for Arm; it may also be available for Intel processors
598 at some point. These all use a form of "memory tagging", where
599 allocations get a tag that is stored in the high-order byte of
600 the address. An offset from the address can be checked by the
601 hardware to see if it still falls within the allocated region
602 based on the tag.
603
604 Control-flow integrity (CFI) has become more of an issue
605 lately because much of what attackers had used in the past has
606 been marked as "no execute" so they are turning to using
607 existing code "gadgets" already present in the kernel by
608 hijacking existing indirect function calls. In C, you can just
609 call pointers without regard to the type as it just treats
610 them as an address to jump to. Clang has a CFI-sanitize
611 feature that enforces the function prototype to restrict the
612 calls that can be made. It is done at runtime and is not
613 perfect, in part because there are lots of functions in the
614 kernel that take one unsigned long parameter and return an
615 unsigned long.
616
617 Attacks on CFI have both a "forward edge", which is what CFI
618 sanitize tries to handle, and a "backward edge" that comes
619 from manipulating the stack values, the return address in
620 particular. Clang has two methods available to prevent the
621 stack manipulation. The first is the "safe stack", which puts
622 various important items (e.g. "safe" variables, register
623 spills, and the return address) on a separate stack.
624 Alternatively, the "shadow stack" feature creates a separate
625 stack just for return addresses.
626
627 One problem with these other stacks is that they are still
628 writable, so if an attacker can find them in memory, they can
629 still perform their attacks. Hardware-based protections, like
630 Intel's Control-Flow Enforcement Technology (CET),
631 [43]provide a read-only shadow call stack for return
632 addresses. Another hardware protection is [44]pointer
633 authentication for Arm, which adds a kind of encrypted tag to
634 the return address that can be verified before it is used.
635
636 Status and challenges
637
638 Cook then went through the current status of handling these
639 different problems in the kernel. VLAs are almost completely
640 gone, he said, just a few remain in the crypto subsystem; he
641 hopes those VLAs will be gone by 4.20 (or whatever the number
642 of the next kernel release turns out to be). Once that
643 happens, he plans to turn on -Wvla for the kernel build so
644 that none creep back in.
645
646 There has been steady progress made on marking fall-through
647 cases in switch statements. Only 745 remain to be handled of
648 the 2311 that existed when this work started; each one
649 requires scrutiny to determine what the author's intent is.
650 Auto-initialized local variables can be done using compiler
651 plugins, but that is "not quite what we want", he said. More
652 compiler support would be helpful there. For arithmetic
653 overflow, it would be nice to see GCC get support for the
654 unsigned case, but memory allocations are now doing explicit
655 overflow checking at this point.
656
657 Bounds checking has seen some "crying about performance hits",
658 so we are waiting impatiently for hardware support, he said.
659 CFI forward-edge protection needs [45]link-time optimization
660 (LTO) support for Clang in the kernel, but it is currently
661 working on Android. For backward-edge mitigation, the Clang
662 shadow call stack is working on Android, but we are
663 impatiently waiting for hardware support for that too.
664
665 There are a number of challenges in doing security development
666 for the kernel, Cook said. There are cultural boundaries due
667 to conservatism within the kernel community; that requires
668 patiently working and reworking features in order to get them
669 upstream. There are, of course, technical challenges because
670 of the complexity of security changes; those kinds of problems
671 can be solved. There are also resource limitations in terms of
672 developers, testers, reviewers, and so on. KSPP and the other
673 kernel security developers are still making that "slow but
674 steady" progress.
675
676 Cook's [46]slides [PDF] are available for interested readers;
677 before long, there should be a video available of the talk as
678 well.
679
680 [I would like to thank LWN's travel sponsor, the Linux
681 Foundation, for travel assistance to attend the Linux Security
682 Summit in Vancouver.]
683
684 [47]Comments (70 posted)
685
686 [48]The second half of the 4.19 merge window
687
688 By Jonathan Corbet
689
690 August 26, 2018 By the time Linus Torvalds [49]released
691 4.19-rc1 and closed the merge window for this development
692 cycle, 12,317 non-merge changesets had found their way into
693 the mainline; about 4,800 of those landed after [50]last
694 week's summary was written. As tends to be the case late in
695 the merge window, many of those changes were fixes for the
696 bigger patches that went in early, but there were also a
697 number of new features added. Some of the more significant
698 changes include:
699
700 Core kernel
701
702 The full set of patches adding [51]control-group awareness to
703 the out-of-memory killer has not been merged due to ongoing
704 disagreements, but one piece of it has: there is a new
705 memory.oom.group control knob that will cause all processes
706 within a control group to be killed in an out-of-memory
707 situation.
708
709 A new set of protections has been added to prevent an attacker
710 from fooling a program into writing to an existing file or
711 FIFO. An open with the O_CREAT flag to a file or FIFO in a
712 world-writable, sticky directory (e.g. /tmp ) will fail if the
713 owner of the opening process is not the owner of either the
714 target file or the containing directory. This behavior,
715 disabled by default, is controlled by the new
716 protected_regular and protected_fifos sysctl knobs.
717
718 Filesystems and block layer
719
720 The dm-integrity device-mapper target can now use a separate
721 device for metadata storage.
722
723 EROFS, the "enhanced read-only filesystem", has been added to
724 the staging tree. It is " a lightweight read-only file system
725 with modern designs (eg. page-sized blocks, inline
726 xattrs/data, etc.) for scenarios which need high-performance
727 read-only requirements, eg. firmwares in mobile phone or
728 LIVECDs ".
729
730 The new "metadata copy-up" feature in overlayfs will avoid
731 copying a file's contents to the upper layer on a
732 metadata-only change. See [52]this commit for details.
733
734 Hardware support
735
736 Graphics : Qualcomm Adreno A6xx GPUs.
737
738 Industrial I/O : Spreadtrum SC27xx series PMIC
739 analog-to-digital converters, Analog Devices AD5758
740 digital-to-analog converters, Intersil ISL29501 time-of-flight
741 sensors, Silicon Labs SI1133 UV index/ambient light sensor
742 chips, and Bosch Sensortec BME680 sensors.
743
744 Miscellaneous : Generic ADC-based resistive touchscreens,
745 Generic ASIC devices via the Google [53]Gasket framework ,
746 Analog Devices ADGS1408/ADGS1409 multiplexers, Actions Semi
747 Owl SoCs DMA controllers, MEN 16Z069 watchdog timers, Rohm
748 BU21029 touchscreen controllers, Cirrus Logic CS47L35,
749 CS47L85, CS47L90, and CS47L91 codecs, Cougar 500k gaming
750 keyboards, Qualcomm GENI-based I2C controllers, Actions
751 Semiconductor Owl I2C controllers, ChromeOS EC-based USBPD
752 chargers, and Analog Devices ADP5061 battery chargers.
753
754 USB : Nuvoton NPCM7XX on-chip EHCI USB controllers, Broadcom
755 Stingray PCIe PHYs, and Renesas R-Car generation 3 PCIe PHYs.
756
757 There is also a new subsystem for the abstraction of GNSS
758 (global navigation satellite systems — GPS, for example)
759 receivers in the kernel. To date, such devices have been
760 handled with an abundance of user-space drivers; the hope is
761 to bring some order to this area. Support for u-blox and
762 SiRFstar receivers has been added as well.
763
764 Kernel internal
765
766 The __deprecated marker, used to mark interfaces that should
767 no longer be used, has been deprecated and removed from the
768 kernel entirely. [54]Torvalds said : " They are not useful.
769 They annoy everybody, and nobody ever does anything about
770 them, because it's always 'somebody elses problem'. And when
771 people start thinking that warnings are normal, they stop
772 looking at them, and the real warnings that mean something go
773 unnoticed. "
774
775 The minimum version of GCC required by the kernel has been
776 moved up to 4.6.
777
778 There are a couple of significant changes that failed to get
779 in this time around, including the [55]XArray data structure.
780 The patches are thought to be ready, but they had the bad luck
781 to be based on a tree that failed to be merged for other
782 reasons, so Torvalds [56]didn't even look at them . That, in
783 turn, blocks another set of patches intended to enable
784 migration of slab-allocated objects.
785
786 The other big deferral is the [57]new system-call API for
787 filesystem mounting . Despite ongoing [58]concerns about what
788 happens when the same low-level device is mounted multiple
789 times with conflicting options, Al Viro sent [59]a pull
790 request to send this work upstream. The ensuing discussion
791 made it clear that there is still not a consensus in this
792 area, though, so it seems that this work has to wait for
793 another cycle.
794
795 Assuming all goes well, the kernel will stabilize over the
796 coming weeks and the final 4.19 release will happen in
797 mid-October.
798
799 [60]Comments (1 posted)
800
801 [61]Measuring (and fixing) I/O-controller throughput loss
802
803 August 29, 2018
804
805 This article was contributed by Paolo Valente
806
807 Many services, from web hosting and video streaming to cloud
808 storage, need to move data to and from storage. They also
809 often require that each per-client I/O flow be guaranteed a
810 non-zero amount of bandwidth and a bounded latency. An
811 expensive way to provide these guarantees is to over-provision
812 storage resources, keeping each resource underutilized, and
813 thus have plenty of bandwidth available for the few I/O flows
814 dispatched to each medium. Alternatively one can use an I/O
815 controller. Linux provides two mechanisms designed to throttle
816 some I/O streams to allow others to meet their bandwidth and
817 latency requirements. These mechanisms work, but they come at
818 a cost: a loss of as much as 80% of total available I/O
819 bandwidth. I have run some tests to demonstrate this problem;
820 some upcoming improvements to the [62]bfq I/O scheduler
821 promise to improve the situation considerably.
822
823 Throttling does guarantee control, even on drives that happen
824 to be highly utilized but, as will be seen, it has a hard time
825 actually ensuring that drives are highly utilized. Even with
826 greedy I/O flows, throttling easily ends up utilizing as
827 little as 20% of the available speed of a flash-based drive.
828 Such a speed loss may be particularly problematic with
829 lower-end storage. On the opposite end, it is also
830 disappointing with high-end hardware, as the Linux block I/O
831 stack itself has been [63]redesigned from the ground up to
832 fully utilize the high speed of modern, fast storage. In
833 addition, throttling fails to guarantee the expected
834 bandwidths if I/O contains both reads and writes, or is
835 sporadic in nature.
836
837 On the bright side, there now seems to be an effective
838 alternative for controlling I/O: the proportional-share policy
839 provided by the bfq I/O scheduler. It enables nearly 100%
840 storage bandwidth utilization, at least with some of the
841 workloads that are problematic for throttling. An upcoming
842 version of bfq may be able to achieve this result with almost
843 all workloads. Finally, bfq guarantees bandwidths with all
844 workloads. The current limitation of bfq is that its execution
845 overhead becomes significant at speeds above 400,000 I/O
846 operations per second on commodity CPUs.
847
848 Using the bfq I/O scheduler, Linux can now guarantee low
849 latency to lightweight flows containing sporadic, short I/O.
850 No throughput issues arise, and no configuration is required.
851 This capability benefits important, time-sensitive tasks, such
852 as video or audio streaming, as well as executing commands or
853 starting applications. Although benchmarks are not available
854 yet, these guarantees might also be provided by the newly
855 proposed [64]I/O latency controller . It allows administrators
856 to set target latencies for I/O requests originating from each
857 group of processes, and favors the groups with the lowest
858 target latency.
859
860 The testbed
861
862 I ran the tests with an ext4 filesystem mounted on a PLEXTOR
863 PX-256M5S SSD, which features a peak rate of ~160MB/s with
864 random I/O, and of ~500MB/s with sequential I/O. I used
865 blk-mq, in Linux 4.18. The system was equipped with a 2.4GHz
866 Intel Core i7-2760QM CPU and 1.3GHz DDR3 DRAM. In such a
867 system, a single thread doing synchronous reads reaches a
868 throughput of 23MB/s.
869
870 For the purposes of these tests, each process is considered to
871 be in one of two groups, termed "target" and "interferers". A
872 target is a single-process, I/O-bound group whose I/O is the
873 focus of the measurement. In particular, I measure the I/O throughput
874 enjoyed by this group to get the minimum bandwidth delivered
875 to the group. An interferer is a single-process group whose role
876 is to generate additional I/O that interferes with the I/O of
877 the target. The tested workloads contain one target and
878 multiple interferers.
879
880 The single process in each group either reads or writes,
881 through asynchronous (buffered) operations, to one file —
882 different from the file read or written by any other process —
883 after invalidating the buffer cache for the file. I define a
884 reader or writer process as either "random" or "sequential",
885 depending on whether it reads or writes its file at random
886 positions or sequentially. Finally, an interferer is defined
887 as being either "active" or "inactive" depending on whether it
888 performs I/O during the test. When an interferer is mentioned,
889 it is assumed that the interferer is active.
890
891 Workloads are defined so as to try to cover the combinations
892 that, I believe, most influence the performance of the storage
893 device and of the I/O policies. For brevity, in this article I
894 show results for only two groups of workloads:
895
896 Static sequential : four synchronous sequential readers or
897 four asynchronous sequential writers, plus five inactive
898 interferers.
899
900 Static random : four synchronous random readers, all with a
901 block size equal to 4k, plus five inactive interferers.
902
903 To create each workload, I considered, for each mix of
904 interferers in the group, two possibilities for the target: it
905 could be either a random or a sequential synchronous reader.
906 In [65]a longer version of this article [PDF] , you will also
907 find results for workloads with varying degrees of I/O
908 randomness, and for dynamic workloads (containing sporadic I/O
909 sources). These extra results confirm the losses of throughput
910 and I/O control for throttling that are shown here.
911
912 I/O policies
913
914 Linux provides two I/O-control mechanisms for guaranteeing (a
915 minimum) bandwidth, or at least fairness, to long-lived flows:
916 the throttling and proportional-share I/O policies. With
917 throttling, one can set a maximum bandwidth limit — "max
918 limit" for brevity — for the I/O of each group. Max limits can
919 be used, in an indirect way, to provide the service guarantee
920 at the focus of this article. For example, a group can be
921 guaranteed a minimum bandwidth by limiting the maximum
922 bandwidth of all the other groups.
923
924
925 Unfortunately, max limits have two drawbacks in terms of
926 throughput. First, if some groups do not use their allocated
927 bandwidth, that bandwidth cannot be reclaimed by other active
928 groups. Second, limits must comply with the worst-case speed
929 of the device, namely, its random-I/O peak rate. Such limits
930 will clearly leave a lot of throughput unused with workloads
931 that otherwise would drive the device to higher throughput
932 levels. Maximizing throughput is simply not a goal of max
933 limits. So, for brevity, test results with max limits are not
934 shown here. You can find these results, plus a more detailed
935 description of the above drawbacks, in the long version of
936 this article.
937
938 Because of these drawbacks, a new, still experimental, low
939 limit has been added to the throttling policy. If a group is
940 assigned a low limit, then the throttling policy automatically
941 limits the I/O of the other groups in such a way as to
942 guarantee the group a minimum bandwidth equal to its low
943 limit. This new throttling mechanism throttles no group as
944 long as every group is getting at least its assigned minimum
945 bandwidth. I tested this mechanism, but did not consider the
946 interesting problem of guaranteeing minimum bandwidths while,
947 at the same time, enforcing maximum bandwidths.
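As a rough sketch of how the two throttling interfaces are driven (the cgroup paths, group names, and the 8:0 device number are illustrative assumptions; io.low is, as noted, still experimental):

```python
# Illustrative cgroup-v2 throttling configuration.  10485760 bytes/s
# is 10MB/s; "8:0" is an assumed major:minor device number.
limit = "8:0 rbps=10485760 wbps=10485760"

# Max limit: cap an interferer group's read and write bandwidth,
# indirectly guaranteeing the remaining bandwidth to other groups.
with open("/sys/fs/cgroup/interferer1/io.max", "w") as f:
    f.write(limit)

# Low limit: instead ask for a 10MB/s floor for the target group;
# throttling then limits the other groups only when needed.
with open("/sys/fs/cgroup/target/io.low", "w") as f:
    f.write(limit)
```

The same key=value format (rbps, wbps, riops, wiops) is used by both interface files; only the semantics differ.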
948
949 The other I/O policy available in Linux, proportional share,
950 provides weighted fairness. Each group is assigned a weight,
951 and should receive a portion of the total throughput
952 proportional to its weight. This scheme guarantees minimum
953 bandwidths in the same way that low limits do in throttling.
954 In particular, it guarantees to each group a minimum bandwidth
955 equal to the ratio between the weight of the group, and the
956 sum of the weights of all the groups that may be active at the
957 same time.
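As a small sketch of that guarantee (the function is illustrative; the weights are those used for prop-bfq in the table below, and 500MB/s is the drive's sequential peak rate from the testbed section, taken here as the available total for simplicity):

```python
def min_bandwidth(total_mbps, weight, all_weights):
    # Proportional share: a group's guaranteed floor is its weight's
    # fraction of the total throughput, assuming every group that may
    # be active actually is active.
    return total_mbps * weight / sum(all_weights)

# prop-bfq weights: target 300, four active interferers at 100 each,
# five inactive interferers at 200 each (sum: 1700).
weights = [300] + [100] * 4 + [200] * 5
floor = min_bandwidth(500, 300, weights)
print(round(floor, 1))  # 88.2 MB/s if every group were active at once
```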
958
959 The actual implementation of the proportional-share policy, on
960 a given drive, depends on what flavor of the block layer is in
961 use for that drive. If the drive is using the legacy block
962 interface, the policy is implemented by the cfq I/O scheduler.
963 Unfortunately, cfq fails to control bandwidths with
964 flash-based storage, especially on drives featuring command
965 queueing. This case is not considered in these tests. With
966 drives using the multiqueue interface, proportional share is
967 implemented by bfq. This is the combination considered in the
968 tests.
969
970 To benchmark both throttling (low limits) and proportional
971 share, I tested, for each workload, the combinations of I/O
972 policies and I/O schedulers reported in the table below. In
973 the end, there are three test cases for each workload. In
974 addition, for some workloads, I considered two versions of bfq
975 for the proportional-share policy.
976
977 Name      I/O policy         Scheduler  Target  Each active       Each inactive      Sum of
978                                         param.  interferer (x4)   interferer (x5)    params
979 low-none  Throttling with    none       10MB/s  10MB/s (tot: 40)  20MB/s (tot: 100)  150MB/s
980           low limits
981 prop-bfq  Proportional       bfq        300     100 (tot: 400)    200 (tot: 1000)    1700
982           share
983
984 (The parameters are bandwidth limits for low-none and weights for
985 prop-bfq.)
1018
1019 For low limits, I report results with only none as the I/O
1020 scheduler, because the results are the same with kyber and
1021 mq-deadline.
1022
1023 The capabilities of the storage medium and of low limits drove
1024 the policy configurations. In particular:
1025
1026 The configuration of the target and of the active interferers
1027 for low-none is the one for which low-none provides its best
1028 possible minimum-bandwidth guarantee to the target: 10MB/s,
1029 guaranteed if all interferers are readers. Results remain the
1030 same regardless of the values used for target latency and idle
1031 time; I set them to 100µs and 1000µs, respectively, for every
1032 group.
1033
1034 Low limits for inactive interferers are set to twice the
1035 limits for active interferers, to pose greater difficulties to
1036 the policy.
1037
1038 I chose weights for prop-bfq so as to guarantee about the same
1039 minimum bandwidth as low-none to the target, in the same
1040 only-reader worst case as for low-none and to preserve,
1041 between the weights of active and inactive interferers, the
1042 same ratio as between the low limits of active and inactive
1043 interferers.
1044
1045 Full details on configurations can be found in the long
1046 version of this article.
1047
1048 Each workload was run ten times for each policy, plus ten
1049 times without any I/O control, i.e., with none as I/O
1050 scheduler and no I/O policy in use. For each run, I measured
1051 the I/O throughput of the target (which reveals the bandwidth
1052 provided to the target), the cumulative I/O throughput of the
1053 interferers, and the total I/O throughput. These quantities
1054 fluctuated very little during each run, as well as across
1055 different runs. Thus in the graphs I report only averages over
1056 per-run average throughputs. In particular, for the case of no
1057 I/O control, I report only the total I/O throughput, to give
1058 an idea of the throughput that can be reached without imposing
1059 any control.
1060
1061 Results
1062
1063 This plot shows throughput results for the simplest group of
1064 workloads: the static-sequential set.
1065
1066 With a random reader as the target against sequential readers
1067 as interferers, low-none does guarantee the configured low
1068 limit to the target. Yet it reaches only a low total
1069 throughput. The throughput of the random reader evidently
1070 oscillates around 10MB/s during the test. This implies that it
1071 is at least slightly below 10MB/s for a significant percentage
1072 of the time. But when this happens, the low-limit mechanism
1073 limits the maximum bandwidth of every active group to the low
1074 limit set for the group, i.e., to just 10MB/s. The end result
1075 is a total throughput lower than 10% of the throughput reached
1076 without I/O control.
1077
1078 That said, the high throughput achieved without I/O control is
1079 obtained by choking the random I/O of the target in favor of
1080 the sequential I/O of the interferers. Thus, it is probably
1081 more interesting to compare low-none throughput with the
1082 throughput reachable while actually guaranteeing 10MB/s to the
1083 target. The target is a single, synchronous, random reader,
1084 which reaches 23MB/s while active. So, to guarantee 10MB/s to
1085 the target, it is enough to serve it for about half of the
1086 time, and the interferers for the other half. Since the device
1087 reaches ~500MB/s with the sequential I/O of the interferers,
1088 the resulting throughput with this service scheme would be
1089 (500+23)/2, or about 260MB/s. low-none thus reaches less than
1090 20% of the total throughput that could be reached while still
1091 preserving the target bandwidth.
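The back-of-the-envelope estimate in this paragraph can be written out explicitly (a sketch using only figures already given in the article):

```python
# Serve the single random-reading target (23MB/s when active) for half
# the time, and the sequential interferers (~500MB/s) the other half.
target_rate = 23       # MB/s, synchronous random reader
interferer_rate = 500  # MB/s, sequential peak rate of the drive
total = (target_rate + interferer_rate) / 2
target_share = target_rate / 2  # 11.5MB/s, above the 10MB/s guarantee
print(total)  # 261.5, i.e. "about 260MB/s"
```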
1092
1093 prop-bfq provides the target with a slightly higher throughput
1094 than low-none. This makes it harder for prop-bfq to reach a
1095 high total throughput, because prop-bfq serves more random I/O
1096 (from the target) than low-none. Nevertheless, prop-bfq gets a
1097 much higher total throughput than low-none. According to the
1098 above estimate, this throughput is about 90% of the maximum
1099 throughput that could be reached, for this workload, without
1100 violating service guarantees. The reason for this good result
1101 is that bfq provides an effective implementation of the
1102 proportional-share service policy. At any time, each active
1103 group is granted a fraction of the current total throughput,
1104 and the sum of these fractions is equal to one; so group
1105 bandwidths naturally saturate the available total throughput
1106 at all times.
1107
1108 Things change with the second workload: a random reader
1109 against sequential writers. Now low-none reaches a much higher
1110 total throughput than prop-bfq. low-none serves much more
1111 sequential (write) I/O than prop-bfq because writes somehow
1112 break the low-limit mechanisms and prevail over the reads of
1113 the target. Conceivably, this happens because writes tend to
1114 both starve reads in the OS (mainly by eating all available
1115 I/O tags) and to cheat on their completion time in the drive.
1116 In contrast, bfq is intentionally configured to privilege
1117 reads, to counter these issues.
1118
1119 In particular, low-none gets an even higher throughput than no
1120 I/O control at all because it penalizes the random I/O of the
1121 target even more than the no-controller configuration.
1122
1123 Finally, with the last two workloads, prop-bfq reaches even
1124 higher total throughput than with the first two. It happens
1125 because the target also does sequential I/O, and serving
1126 sequential I/O is much more beneficial for throughput than
1127 serving random I/O. With these two workloads, the total
1128 throughput is, respectively, close to or much higher than that
1129 reached without I/O control. For the last workload, the total
1130 throughput is much higher because, unlike none, bfq
1131 privileges reads over asynchronous writes, and reads yield a
1132 higher throughput than writes. In contrast, low-none still
1133 gets lower or much lower throughput than prop-bfq, because of
1134 the same issues that hinder low-none throughput with the first
1135 two workloads.
1136
1137 As for bandwidth guarantees, with readers as interferers
1138 (third workload), prop-bfq, as expected, gives the target a
1139 fraction of the total throughput proportional to its weight.
1140 bfq approximates perfect proportional-share bandwidth
1141 distribution among groups doing I/O of the same type (reads or
1142 writes) and with the same locality (sequential or random).
1143 With the last workload, prop-bfq gives much more throughput to
1144 the reader than to all the interferers, because interferers
1145 are asynchronous writers, and bfq privileges reads.
1146
1147 The second group of workloads (static random) is the one,
1148 among all the workloads considered, for which prop-bfq
1149 performs worst. Results are shown below:
1150
1151 This chart reports results not only for mainline bfq, but also
1152 for an improved version of bfq which is currently under public
1153 testing. As can be seen, with only random readers, prop-bfq
1154 reaches a much lower total throughput than low-none. This
1155 happens because of the Achilles heel of the bfq I/O scheduler.
1156 If the process in service does synchronous I/O and has a
1157 higher weight than some other process, then, to give strong
1158 bandwidth guarantees to that process, bfq plugs I/O
1159 dispatching every time the process temporarily stops issuing
1160 I/O requests. In this respect, processes actually have
1161 differentiated weights and do synchronous I/O in the workloads
1162 tested. So bfq systematically performs I/O plugging for them.
1163 Unfortunately, this plugging empties the internal queues of
1164 the drive, which kills throughput with random I/O. And the I/O
1165 of all processes in these workloads is also random.
1166
1167 The situation reverses with a sequential reader as target.
1168 Yet, the most interesting results come from the new version of
1169 bfq, containing small changes to counter exactly the above
1170 weakness. This version recovers most of the throughput loss
1171 with the workload made of only random I/O and more; with the
1172 second workload, where the target is a sequential reader, it
1173 reaches about 3.7 times the total throughput of low-none.
1174
1175 When the main concern is the latency of flows containing short
1176 I/O, Linux seems now rather high performing, thanks to the bfq
1177 I/O scheduler and the I/O latency controller. But if the
1178 requirement is to provide explicit bandwidth guarantees (or
1179 just fairness) to I/O flows, then one must be ready to give up
1180 much or most of the speed of the storage media. bfq helps with
1181 some workloads, but loses most of the throughput with
1182 workloads consisting of mostly random I/O. Fortunately, there
1183 is apparently hope for much better performance since an
1184 improvement, still under development, seems to enable bfq to
1185 reach a high throughput with all workloads tested so far.
1186
1187 [ I wish to thank Vivek Goyal for enabling me to make this
1188 article much more fair and sound.]
1189
1190 [66]Comments (4 posted)
1191
1192 [67]KDE's onboarding initiative, one year later
1193
1194 August 24, 2018
1195
1196 This article was contributed by Marta Rybczyńska
1197
1198 [68]Akademy
1199
1200 In 2017, the KDE community decided on [69]three goals to
1201 concentrate on for the next few years. One of them was
1202 [70]streamlining the onboarding of new contributors (the
1203 others were [71]improving usability and [72]privacy ). During
1204 [73]Akademy , the yearly KDE conference that was held in
1205 Vienna in August, Neofytos Kolokotronis shared the status of
1206 the onboarding goal, the work done during the last year, and
1207 further plans. While it is a complicated process in a project
1208 as big and diverse as KDE, numerous improvements have already
1209 been made.
1210
1211 Two of the three KDE community goals were proposed by relative
1212 newcomers. Kolokotronis was one of those, having joined the
1213 [74]KDE Promo team not long before proposing the focus on
1214 onboarding. He had previously been involved with [75]Chakra
1215 Linux , a distribution based on KDE software. The fact that
1216 new members of the community proposed strategic goals was also
1217 noted in the [76]Sunday keynote by Claudia Garad .
1218
1219 Proper onboarding adds excitement to the contribution process
1220 and increases retention, he explained. When we look at [77]the
1221 definition of onboarding , it is a process in which the new
1222 contributors acquire knowledge, skills, and behaviors so that
1223 they can contribute effectively. Kolokotronis proposed to see
1224 it also as socialization: integration into the project's
1225 relationships, culture, structure, and procedures.
1226
1227 The gains from proper onboarding are many. The project can
1228 grow by attracting new blood with new perspectives and
1229 solutions. The community maintains its health and stays
1230 vibrant. Another important advantage of efficient onboarding
1231 is that replacing current contributors becomes easier when
1232 they change interests, jobs, or leave the project for whatever
1233 reason. Finally, successful onboarding adds new advocates to
1234 the project.
1235
1236 Achievements so far and future plans
1237
1238 The team started with ideas for a centralized onboarding
1239 process for the whole of KDE. They found out quickly that this
1240 would not work because KDE is "very decentralized", so it is
1241 hard to provide tools and procedures that are going to work
1242 for the whole project. According to Kolokotronis, other
1243 characteristics of KDE that impact onboarding are high
1244 diversity, remote and online teams, and hundreds of
1245 contributors in dozens of projects and teams. In addition, new
1246 contributors already know in which area they want to take part
1247 and they prefer specific information that will be directly
1248 useful for them.
1249
1250 So the team changed its approach; several changes have since
1251 been proposed and implemented. The [78]Get Involved page,
1252 which is expected to be one of the resources new contributors
1253 read first, has been rewritten. For the [79]Junior Jobs page ,
1254 the team is [80] [81]discussing what the generic content for
1255 KDE as a whole should be. The team simplified [82]Phabricator
1256 registration , which resulted in documenting the process
1257 better. Another part of the work includes the [83]KDE Bugzilla
1258 ; it includes, for example, initiatives to limit the number of
1259 states of a ticket or remove obsolete products.
1260
1261 The [84]Plasma Mobile team is heavily involved in the
1262 onboarding goal. The Plasma Mobile developers have simplified
1263 their development environment setup and created an
1264 [85]interactive "Get Involved" page. In addition, the Plasma
1265 team changed the way task descriptions are written; they now
1266 contain more detail, so that it is easier to get involved. The
1267 basic description should be short and clear, and it should
1268 include details of the problem and possible solutions. The
1269 developers try to share the list of skills necessary to
1270 fulfill the tasks and include clear links to the technical
1271 resources needed.
1272
1273 Kolokotronis and team also identified a new potential source
1274 of contributors for KDE: distributions using KDE. They have
1275 the advantage of already knowing and using the software. The
1276 next idea the team is working on is to make sure that setting
1277 up a development environment is easy. The team plans to work
1278 on this during a dedicated sprint this autumn.
1279
1280 Searching for new contributors
1281
1282 Kolokotronis plans to search for new contributors at the
1283 periphery of the project, among the "skilled enthusiasts":
1284 loyal users who actually care about the project. They "can
1285 make wonders", he said. Those individuals may also be less
1286 confident or shy, have trouble taking the first step, and
1287 need guidance. The project leaders should take that into
1288 account.
1289
1290 In addition, newcomers are all different. Kolokotronis
1291 provided a long list of how contributors differ, including
1292 skills and knowledge, motives and interests, and time and
1293 dedication. His advice is to "try to find their superpower",
1294 the skills they have that are missing in the team. Those
1295 "superpowers" can then be used for the benefit of the project.
1296
1297 If a project does nothing else, he said, it can start with its
1298 documentation. However, this means more than just code
1299 documentation. Writing down the procedures or information
1300 about the internal work of the project, like who is working on
1301 what, is an important part of a project's documentation and
1302 helps newcomers. There should also be guidelines on how to
1303 start, especially setting up the development environment.
1304
1305 The first thing the project leaders should do, according to
1306 Kolokotronis, is to spend time on introducing newcomers to the
1307 project. Ideally every new contributor should be assigned
1308 mentors — more experienced members who can help them when
1309 needed. The mentors and project leaders should find tasks that
1310 are interesting for each person. Answering an audience
1311 question on suggestions for shy new contributors, he
1312 recommended even more mentoring. It is also very helpful to
1313 make sure that newcomers have enough to read, but "avoid
1314 RTFM", he highlighted. It is also easy for a new contributor
1315 "to fly away", he said. The solution is to keep requesting
1316 things and be proactive.
1317
1318 What can the project do?
1319
1320 Kolokotronis suggested a number of actions for a project when
1321 it wants to improve its onboarding. The first step is
1322 preparation: the project leaders should know the team's and
1323 the project's needs. Long-term planning is important, too. It
1324 is not enough to wait for contributors to come — the project
1325 should be proactive, which means reaching out to candidates,
1326 suggesting appropriate tasks and, finally, making people
1327 available for the newcomers if they need help.
1328
1329 This leads to the next step: to be a mentor. Kolokotronis suggests
1330 being a "great host", but also trying to phase out the
1331 dependency on the mentor rapidly. "We have been all
1332 newcomers", he said. It can be intimidating to join an
1333 existing group. Onboarding creates a sense of belonging which,
1334 in turn, increases retention.
1335
1336 The last step proposed was to be strategic. This includes
1337 thinking about the emotions you want newcomers to feel.
1338 Kolokotronis explained the strategic part with an example. The
1339 overall goal is (surprise!) to improve the onboarding of new
1340 contributors. An intermediate objective might be to keep the
1341 newcomers after they have made their first commit. If your
1342 strategy is to keep them confident and proud, you can use
1343 different tactics like praise and acknowledgment of the work
1344 in public. Another useful tactic may be assigning simple
1345 tasks, according to the skill of the contributor.
1346
1347 To summarize, the most important thing, according to
1348 Kolokotronis, is to respond quickly and spend time with new
1349 contributors. This time should be used to explain procedures,
1350 and to introduce the people and culture. It is also essential
1351 to guide first contributions and praise the contributor's skill
1352 and effort. Increase the difficulty of tasks over time to keep
1353 contributors motivated and challenged. And finally, he said,
1354 "turn them into mentors".
1355
1356 Kolokotronis acknowledges that onboarding "takes time" and
1357 "everyone complains" about it. However, he is convinced that
1358 it is beneficial in the long term and that it decreases
1359 developer turnover.
1360
1361 Advice to newcomers
1362
1363 Kolokotronis concluded with some suggestions for newcomers to
1364 a project. They should try to be persistent and to not get
1365 discouraged when something goes wrong. Building connections
1366 from the very beginning is helpful. He suggests asking
1367 questions as if you were already a member "and things will be
1368 fine". However, accept criticism if it happens.
1369
1370 One of the next actions of the onboarding team will be to
1371 collect feedback from newcomers and experienced contributors
1372 to see if they agree on the ideas and processes introduced so
1373 far.
1374
1375 [86]Comments (none posted)
1376
1377 [87]Sharing and archiving data sets with Dat
1378
1379 August 27, 2018
1380
1381 This article was contributed by Antoine Beaupré
1382
1383 [88]Dat is a new peer-to-peer protocol that uses some of the
1384 concepts of [89]BitTorrent and Git. Dat primarily targets
1385 researchers and open-data activists as it is a great tool for
1386 sharing, archiving, and cataloging large data sets. But it can
1387 also be used to implement decentralized web applications in a
1388 novel way.
1389
1390 Dat quick primer
1391
1392 Dat is written in JavaScript, so it can be installed with npm
1393 , but there are [90]standalone binary builds and a [91]desktop
1394 application (as an AppImage). An [92]online viewer can be used
1395 to inspect data for those who do not want to install arbitrary
1396 binaries on their computers.
1397
1398 The command-line application allows basic operations like
1399 downloading existing data sets and sharing your own. Dat uses
1400 a 64-character hex string encoding a 32-byte [93]ed25519
1401 public key , which is used to discover content on the net. For
1402 example, this will download some sample data:
1403
1404 $ dat clone \
1405 dat://778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943666fe639 \
1406 ~/Downloads/dat-demo
1407
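The address in that command is easy to sanity-check: it is 64 hex characters, which decode to a 32-byte ed25519 public key (a small sketch using only the string from the example above):

```python
import binascii

# The Dat address from the example above, split for readability.
key_hex = ("778f8d955175c92e4ced5e4f5563f69bfec0c86cc6f670352c457943"
           "666fe639")
key = binascii.unhexlify(key_hex)  # raises an error on non-hex input
print(len(key))  # 32 bytes
```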
1408
1409 Similarly, the share command is used to share content. It
1410 indexes the files in a given directory and creates a new
1411 unique address like the one above. The share command starts a
1412 server that uses multiple discovery mechanisms (currently, the
1413 [94]Mainline Distributed Hash Table (DHT), a [95]custom DNS
1414 server , and multicast DNS) to announce the content to its
1415 peers. This is how another user, armed with that public key,
1416 can download that content with dat clone or mirror the files
1417 continuously with dat sync .
1418
1419 So far, this looks a lot like BitTorrent [96]magnet links
1420 updated with 21st century cryptography. But Dat adds revisions
1421 on top of that, so modifications are automatically shared
1422 through the swarm. That is important for public data sets as
1423 those are often dynamic in nature. Revisions also make it
possible to use [97]Dat as a backup system by saving the data
incrementally using an [98]archiver.
1426
1427 While Dat is designed to work on larger data sets, processing
1428 them for sharing may take a while. For example, sharing the
1429 Linux kernel source code required about five minutes as Dat
1430 worked on indexing all of the files. This is comparable to the
1431 performance offered by [99]IPFS and BitTorrent. Data sets with
1432 more or larger files may take quite a bit more time.
1433
1434 One advantage that Dat has over IPFS is that it doesn't
1435 duplicate the data. When IPFS imports new data, it duplicates
the files into ~/.ipfs. For collections of small files like
1437 the kernel, this is not a huge problem, but for larger files
1438 like videos or music, it's a significant limitation. IPFS
1439 eventually implemented a solution to this [100]problem in the
form of the experimental [101]filestore feature, but it's not
1441 enabled by default. Even with that feature enabled, though,
1442 changes to data sets are not automatically tracked. In
1443 comparison, Dat operation on dynamic data feels much lighter.
1444 The downside is that each set needs its own dat share process.
1445
1446 Like any peer-to-peer system, Dat needs at least one peer to
1447 stay online to offer the content, which is impractical for
1448 mobile devices. Hosting providers like [102]Hashbase (which is
1449 a [103]pinning service in Dat jargon) can help users keep
content online without running their own [104]server. The
1451 closest parallel in the traditional web ecosystem would
1452 probably be content distribution networks (CDN) although
1453 pinning services are not necessarily geographically
1454 distributed and a CDN does not necessarily retain a complete
copy of a website.
1456
A web browser called [106]Beaker, based on the [107]Electron
1458 framework, can access Dat content natively without going
1459 through a pinning service. Furthermore, Beaker is essential to
1460 get any of the [108]Dat applications working, as they
1461 fundamentally rely on dat:// URLs to do their magic. This
1462 means that Dat applications won't work for most users unless
1463 they install that special web browser. There is a [109]Firefox
extension called "[110]dat-fox" for people who don't want to
1465 install yet another browser, but it requires installing a
1466 [111]helper program . The extension will be able to load
1467 dat:// URLs but many applications will still not work. For
1468 example, the [112]photo gallery application completely fails
1469 with dat-fox.
1470
1471 Dat-based applications look promising from a privacy point of
1472 view. Because of its peer-to-peer nature, users regain control
1473 over where their data is stored: either on their own computer,
1474 an online server, or by a trusted third party. But considering
1475 the protocol is not well established in current web browsers,
1476 I foresee difficulties in adoption of that aspect of the Dat
1477 ecosystem. Beyond that, it is rather disappointing that Dat
1478 applications cannot run natively in a web browser given that
1479 JavaScript is designed exactly for that.
1480
1481 Dat privacy
1482
1483 An advantage Dat has over other peer-to-peer protocols like
1484 BitTorrent is end-to-end encryption. I was originally
1485 concerned by the encryption design when reading the
[113]academic paper [PDF]:
1487
1488 It is up to client programs to make design decisions around
1489 which discovery networks they trust. For example if a Dat
1490 client decides to use the BitTorrent DHT to discover peers,
1491 and they are searching for a publicly shared Dat key (e.g. a
1492 key cited publicly in a published scientific paper) with known
1493 contents, then because of the privacy design of the BitTorrent
1494 DHT it becomes public knowledge what key that client is
1495 searching for.
1496
In other words, to share a secret file with another user,
1498 the public key is transmitted over a secure side-channel, only
1499 to then leak during the discovery process. Fortunately, the
1500 public Dat key is not directly used during discovery as it is
1501 [114]hashed with BLAKE2B . Still, the security model of Dat
1502 assumes the public key is private, which is a rather
1503 counterintuitive concept that might upset cryptographers and
1504 confuse users who are frequently encouraged to type such
1505 strings in address bars and search engines as part of the Dat
1506 experience. There is a [115]security & privacy FAQ in the Dat
1507 documentation warning about this problem:
1508
1509 One of the key elements of Dat privacy is that the public key
1510 is never used in any discovery network. The public key is
1511 hashed, creating the discovery key. Whenever peers attempt to
1512 connect to each other, they use the discovery key.
1513
1514 Data is encrypted using the public key, so it is important
1515 that this key stays secure.
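
The hashing step described above can be sketched with
Python's standard library. In the hypercore implementation,
as I understand it, the discovery key is a keyed 32-byte
BLAKE2b hash of the constant string "hypercore", using the
public key as the hash key; treat those exact parameters as
an assumption rather than a reading of the specification:

```python
import hashlib

def discovery_key(public_key_hex: str) -> str:
    """Derive a Dat discovery key from a public key.

    Sketch only: a keyed, 32-byte BLAKE2b digest of the
    constant b"hypercore", with the public key as the hash
    key. The exact parameters are assumptions.
    """
    public_key = bytes.fromhex(public_key_hex)
    digest = hashlib.blake2b(b"hypercore", key=public_key,
                             digest_size=32)
    return digest.hexdigest()
```

The point is that everything announced to a discovery
network is a one-way function of the public key: peers
holding the key can compute and recognize the discovery
key, but an observer cannot recover the key from it.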
1516
1517 There are other privacy issues outlined in the document; it
states that "Dat faces similar privacy risks as BitTorrent":
1519
1520 When you download a dataset, your IP address is exposed to the
1521 users sharing that dataset. This may lead to honeypot servers
1522 collecting IP addresses, as we've seen in Bittorrent. However,
1523 with dataset sharing we can create a web of trust model where
1524 specific institutions are trusted as primary sources for
1525 datasets, diminishing the sharing of IP addresses.
1526
1527 A Dat blog post refers to this issue as [116]reader privacy
1528 and it is, indeed, a sensitive issue in peer-to-peer networks.
1529 It is how BitTorrent users are discovered and served scary
verbiage from lawyers, after all. But Dat improves on this a
little because, to join a swarm, you must already know what
you are looking for, which means the only peers able to
observe swarm activity are those who know the secret public
key.
1534 This works well for secret content, but for larger, public
1535 data sets, it is a real problem; it is why the Dat project has
1536 [117]avoided creating a Wikipedia mirror so far.
1537
During my review of the protocol, I found another privacy
issue that is not documented in the security FAQ. As mentioned
1540 earlier, the [118]Dat discovery protocol routinely phones home
1541 to DNS servers operated by the Dat project. This implies that
1542 the default discovery servers (and an attacker watching over
1543 their traffic) know who is publishing or seeking content, in
1544 essence discovering the "social network" behind Dat. This
1545 discovery mechanism can be disabled in clients, but a similar
1546 privacy issue applies to the DHT as well, although that is
1547 distributed so it doesn't require trust of the Dat project
1548 itself.
1549
1550 Considering those aspects of the protocol, privacy-conscious
1551 users will probably want to use Tor or other anonymization
1552 techniques to work around those concerns.
1553
1554 The future of Dat
1555
1556 [119]Dat 2.0 was released in June 2017 with performance
1557 improvements and protocol changes. [120]Dat Enhancement
1558 Proposals (DEPs) guide the project's future development; most
work is currently geared toward implementing the draft
"[121]multi-writer proposal" in [122]HyperDB. Without
1561 multi-writer support, only the original publisher of a Dat can
modify it. According to Joe Hand, co-executive director of
1563 [123]Code for Science & Society (CSS) and Dat core developer,
1564 in an IRC chat, "supporting multiwriter is a big requirement
1565 for lots of folks". For example, while Dat might allow Alice
1566 to share her research results with Bob, he cannot modify or
1567 contribute back to those results. The multi-writer extension
1568 allows for Alice to assign trust to Bob so he can have write
1569 access to the data.
1570
Unfortunately, the current proposal doesn't solve the "hard
problems" of "conflict merges and secure key distribution".
The former will be worked out through user-interface tweaks,
but the latter is a classic problem that security projects
typically have trouble solving; Dat is no exception. How will
Alice securely trust Bob? The OpenPGP web
1577 of trust? Hexadecimal fingerprints read over the phone? Dat
1578 doesn't provide a magic solution to this problem.
1579
1580 Another thing limiting adoption is that Dat is not packaged in
any distribution that I could find (although I [124]requested
it in Debian) and, considering the speed of change of the
1583 JavaScript ecosystem, this is unlikely to change any time
1584 soon. A [125]Rust implementation of the Dat protocol has
1585 started, however, which might be easier to package than the
1586 multitude of [126]Node.js modules. In terms of mobile device
1587 support, there is an experimental Android web browser with Dat
1588 support called [127]Bunsen , which somehow doesn't run on my
1589 phone. Some adventurous users have successfully run Dat in
[128]Termux. I haven't found an app running on iOS at this
1591 point.
1592
1593 Even beyond platform support, distributed protocols like Dat
have a steep hill to climb against the virtual monopoly of
1595 more centralized protocols, so it remains to be seen how
1596 popular those tools will be. Hand says Dat is supported by
1597 multiple non-profit organizations. Beyond CSS, [129]Blue Link
1598 Labs is working on the Beaker Browser as a self-funded startup
1599 and a grass-roots organization, [130]Digital Democracy , has
1600 contributed to the project. The [131]Internet Archive has
1601 [132]announced a collaboration between itself, CSS, and the
California Digital Library to launch a pilot project to see
"how members of a cooperative, decentralized network can
leverage shared services to ensure data preservation while
reducing storage costs and increasing replication counts".
1606
1607 Hand said adoption in academia has been "slow but steady" and
1608 that the [133]Dat in the Lab project has helped identify areas
1609 that could help researchers adopt the project. Unfortunately,
1610 as is the case with many free-software projects, he said that
1611 "our team is definitely a bit limited on bandwidth to push for
1612 bigger adoption". Hand said that the project received a grant
1613 from [134]Mozilla Open Source Support to improve its
1614 documentation, which will be a big help.
1615
1616 Ultimately, Dat suffers from a problem common to all
1617 peer-to-peer applications, which is naming. Dat addresses are
1618 not exactly intuitive: humans do not remember strings of 64
1619 hexadecimal characters well. For this, Dat took a [135]similar
1620 approach to IPFS by using DNS TXT records and /.well-known URL
1621 paths to bridge existing, human-readable names with Dat
hashes. This sacrifices part of the decentralized nature of
the project in favor of usability.
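
That DNS bridge is simple enough to sketch. My reading of
the [135]proposal is that a TXT record of the form
"datkey=<64-hex-key>" maps a hostname to a Dat key; both
that record format and the helper below should be treated
as assumptions, and the actual DNS query is left out:

```python
def key_from_txt_records(records):
    """Extract a Dat key from a list of DNS TXT record strings.

    Assumes records of the form "datkey=<64 hex characters>",
    per my reading of the Dat DNS proposal; returns None when
    no such record is present.
    """
    for record in records:
        if record.startswith("datkey="):
            key = record[len("datkey="):].strip().lower()
            if len(key) == 64:
                return key
    return None
```

Feeding this the TXT records returned for a hostname would
yield the key to hand to dat clone, letting users type a
memorable name instead of 64 hexadecimal characters.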
1624
1625 I have tested a lot of distributed protocols like Dat in the
1626 past and I am not sure Dat is a clear winner. It certainly has
1627 advantages over IPFS in terms of usability and resource usage,
1628 but the lack of packages on most platforms is a big limit to
1629 adoption for most people. This means it will be difficult to
1630 share content with my friends and family with Dat anytime
1631 soon, which would probably be my primary use case for the
1632 project. Until the protocol reaches the wider adoption that
1633 BitTorrent has seen in terms of platform support, I will
1634 probably wait before switching everything over to this
1635 promising project.
1636
1637 [136]Comments (11 posted)
1638
Page editor: Jonathan Corbet
1640
1641 Inside this week's LWN.net Weekly Edition
1642
[137]Briefs: OpenSSH 7.8; 4.19-rc1; Which stable?; Netdev
0x12; Bison 3.1; Quotes; ...

[138]Announcements: Newsletters; events; security updates;
kernel patches; ...

Next page: [139]Brief items >>
1648
1649
1650
1651 [1] https://lwn.net/Articles/763743/
1652
1653 [2] https://lwn.net/Articles/763626/
1654
1655 [3] https://lwn.net/Articles/763641/
1656
1657 [4] https://lwn.net/Articles/763106/
1658
1659 [5] https://lwn.net/Articles/763603/
1660
1661 [6] https://lwn.net/Articles/763175/
1662
1663 [7] https://lwn.net/Articles/763492/
1664
1665 [8] https://lwn.net/Articles/763254/
1666
1667 [9] https://lwn.net/Articles/763255/
1668
1669 [10] https://lwn.net/Articles/763743/#Comments
1670
1671 [11] https://lwn.net/Articles/763626/
1672
1673 [12] http://julialang.org/
1674
1675 [13] https://julialang.org/blog/2018/08/one-point-zero
1676
1677 [14] https://julialang.org/benchmarks/
1678
1679 [15] https://juliacomputing.com/
1680
1681 [16] https://en.wikipedia.org/wiki/Read%E2%80%93eval%E2%80%93p-
1682 rint_loop
1683
1684 [17] http://llvm.org/
1685
1686 [18] http://www.3blue1brown.com/essence-of-linear-algebra-page/
1687
1688 [19] http://www.netlib.org/lapack/
1689
1690 [20] https://lwn.net/Articles/657157/
1691
1692 [21] https://julialang.org/publications/julia-fresh-approach-B-
1693 EKS.pdf
1694
1695 [22] https://lwn.net/Articles/738915/
1696
1697 [23] https://pypy.org/
1698
1699 [24] https://github.com/JuliaPy/PyCall.jl
1700
1701 [25] https://github.com/JuliaInterop/RCall.jl
1702
1703 [26] https://docs.julialang.org/en/stable/
1704
1705 [27] https://julialang.org/learning/
1706
1707 [28] http://bogumilkaminski.pl/files/julia_express.pdf
1708
1709 [29] https://docs.julialang.org/en/stable/manual/noteworthy-di-
1710 fferences/#Noteworthy-differences-from-Python-1
1711
1712 [30] https://lwn.net/Articles/746386/
1713
1714 [31] https://github.com/JuliaLang/IJulia.jl
1715
1716 [32] https://lwn.net/Articles/764001/
1717
1718 [33] https://lwn.net/Articles/763626/#Comments
1719
1720 [34] https://lwn.net/Articles/763641/
1721
1722 [35] https://lwn.net/Archives/ConferenceByYear/#2018-Linux_Sec-
1723 urity_Summit_NA
1724
1725 [36] https://events.linuxfoundation.org/events/linux-security-
1726 summit-north-america-2018/
1727
1728 [37] https://kernsec.org/wiki/index.php/Kernel_Self_Protection-
1729 _Project
1730
1731 [38] https://lwn.net/Articles/763644/
1732
1733 [39] https://raphlinus.github.io/programming/rust/2018/08/17/u-
1734 ndefined-behavior.html
1735
1736 [40] https://lwn.net/Articles/749064/
1737
1738 [41] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/-
1739 linux.git/commit/?id=02361bc77888
1740
1741 [42] https://lore.kernel.org/lkml/CA+55aFzCG-zNmZwX4A2FQpadafL-
1742 fEzK6CC=qPXydAacU1RqZWA@mail.gmail.com/T/#u
1743
1744 [43] https://lwn.net/Articles/758245/
1745
1746 [44] https://lwn.net/Articles/718888/
1747
1748 [45] https://lwn.net/Articles/744507/
1749
1750 [46] https://outflux.net/slides/2018/lss/danger.pdf
1751
1752 [47] https://lwn.net/Articles/763641/#Comments
1753
1754 [48] https://lwn.net/Articles/763106/
1755
1756 [49] https://lwn.net/Articles/763497/
1757
1758 [50] https://lwn.net/Articles/762566/
1759
1760 [51] https://lwn.net/Articles/761118/
1761
1762 [52] https://git.kernel.org/linus/d5791044d2e5749ef4de84161cec-
1763 5532e2111540
1764
1765 [53] https://lwn.net/ml/linux-kernel/20180630000253.70103-1-sq-
1766 ue@chromium.org/
1767
1768 [54] https://git.kernel.org/linus/771c035372a036f83353eef46dbb-
1769 829780330234
1770
1771 [55] https://lwn.net/Articles/745073/
1772
1773 [56] https://lwn.net/ml/linux-kernel/CA+55aFxFjAmrFpwQmEHCthHO-
1774 zgidCKnod+cNDEE+3Spu9o1s3w@mail.gmail.com/
1775
1776 [57] https://lwn.net/Articles/759499/
1777
1778 [58] https://lwn.net/Articles/762355/
1779
1780 [59] https://lwn.net/ml/linux-fsdevel/20180823223145.GK6515@Ze-
1781 nIV.linux.org.uk/
1782
1783 [60] https://lwn.net/Articles/763106/#Comments
1784
1785 [61] https://lwn.net/Articles/763603/
1786
1787 [62] https://lwn.net/Articles/601799/
1788
1789 [63] https://lwn.net/Articles/552904
1790
1791 [64] https://lwn.net/Articles/758963/
1792
1793 [65] http://algogroup.unimore.it/people/paolo/pub-docs/extende-
1794 d-lat-bw-throughput.pdf
1795
1796 [66] https://lwn.net/Articles/763603/#Comments
1797
1798 [67] https://lwn.net/Articles/763175/
1799
1800 [68] https://lwn.net/Archives/ConferenceByYear/#2018-Akademy
1801
1802 [69] https://dot.kde.org/2017/11/30/kdes-goals-2018-and-beyond
1803
1804 [70] https://phabricator.kde.org/T7116
1805
1806 [71] https://phabricator.kde.org/T6831
1807
1808 [72] https://phabricator.kde.org/T7050
1809
1810 [73] https://akademy.kde.org/
1811
1812 [74] https://community.kde.org/Promo
1813
1814 [75] https://www.chakralinux.org/
1815
1816 [76] https://conf.kde.org/en/Akademy2018/public/events/79
1817
1818 [77] https://en.wikipedia.org/wiki/Onboarding
1819
1820 [78] https://community.kde.org/Get_Involved
1821
1822 [79] https://community.kde.org/KDE/Junior_Jobs
1823
1824 [80] https://lwn.net/Articles/763189/
1825
1826 [81] https://phabricator.kde.org/T8686
1827
1828 [82] https://phabricator.kde.org/T7646
1829
1830 [83] https://bugs.kde.org/
1831
1832 [84] https://www.plasma-mobile.org/index.html
1833
1834 [85] https://www.plasma-mobile.org/findyourway
1835
1836 [86] https://lwn.net/Articles/763175/#Comments
1837
1838 [87] https://lwn.net/Articles/763492/
1839
1840 [88] https://datproject.org
1841
1842 [89] https://www.bittorrent.com/
1843
1844 [90] https://github.com/datproject/dat/releases
1845
1846 [91] https://docs.datproject.org/install
1847
1848 [92] https://datbase.org/
1849
1850 [93] https://ed25519.cr.yp.to/
1851
1852 [94] https://en.wikipedia.org/wiki/Mainline_DHT
1853
1854 [95] https://github.com/mafintosh/dns-discovery
1855
1856 [96] https://en.wikipedia.org/wiki/Magnet_URI_scheme
1857
1858 [97] https://blog.datproject.org/2017/10/13/using-dat-for-auto-
1859 matic-file-backups/
1860
1861 [98] https://github.com/mafintosh/hypercore-archiver
1862
1863 [99] https://ipfs.io/
1864
1865 [100] https://github.com/ipfs/go-ipfs/issues/875
1866
1867 [101] https://github.com/ipfs/go-ipfs/blob/master/docs/experim-
1868 ental-features.md#ipfs-filestore
1869
1870 [102] https://hashbase.io/
1871
1872 [103] https://github.com/datprotocol/DEPs/blob/master/proposal-
1873 s/0003-http-pinning-service-api.md
1874
1875 [104] https://docs.datproject.org/server
1876
1877 [105] https://lwn.net/Articles/763544/
1878
1879 [106] https://beakerbrowser.com/
1880
1881 [107] https://electronjs.org/
1882
1883 [108] https://github.com/beakerbrowser/explore
1884
1885 [109] https://addons.mozilla.org/en-US/firefox/addon/dat-p2p-p-
1886 rotocol/
1887
1888 [110] https://github.com/sammacbeth/dat-fox
1889
1890 [111] https://github.com/sammacbeth/dat-fox-helper
1891
1892 [112] https://github.com/beakerbrowser/dat-photos-app
1893
1894 [113] https://github.com/datproject/docs/raw/master/papers/dat-
1895 paper.pdf
1896
1897 [114] https://github.com/datprotocol/DEPs/blob/653e0cf40233b5d-
1898 474cddc04235577d9d55b2934/proposals/0000-peer-discovery.md#dis-
1899 covery-keys
1900
1901 [115] https://docs.datproject.org/security
1902
1903 [116] https://blog.datproject.org/2016/12/12/reader-privacy-on-
1904 the-p2p-web/
1905
1906 [117] https://blog.datproject.org/2017/12/10/dont-ship/
1907
1908 [118] https://github.com/datprotocol/DEPs/pull/7
1909
1910 [119] https://blog.datproject.org/2017/06/01/dat-sleep-release/
1911
1912 [120] https://github.com/datprotocol/DEPs
1913
1914 [121] https://github.com/datprotocol/DEPs/blob/master/proposal-
1915 s/0008-multiwriter.md
1916
1917 [122] https://github.com/mafintosh/hyperdb
1918
1919 [123] https://codeforscience.org/
1920
1921 [124] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=890565
1922
1923 [125] https://github.com/datrs
1924
1925 [126] https://nodejs.org/en/
1926
1927 [127] https://bunsenbrowser.github.io/#!index.md
1928
1929 [128] https://termux.com/
1930
1931 [129] https://bluelinklabs.com/
1932
1933 [130] https://www.digital-democracy.org/
1934
1935 [131] https://archive.org
1936
1937 [132] https://blog.archive.org/2018/06/05/internet-archive-cod-
1938 e-for-science-and-society-and-california-digital-library-to-pa-
1939 rtner-on-a-data-sharing-and-preservation-pilot-project/
1940
1941 [133] https://github.com/codeforscience/Dat-in-the-Lab
1942
1943 [134] https://www.mozilla.org/en-US/moss/
1944
1945 [135] https://github.com/datprotocol/DEPs/blob/master/proposal-
1946 s/0005-dns.md
1947
1948 [136] https://lwn.net/Articles/763492/#Comments
1949
1950 [137] https://lwn.net/Articles/763254/
1951
1952 [138] https://lwn.net/Articles/763255/
1953
1954 [139] https://lwn.net/Articles/763254/
1955
1956
1957